Gary's Guide To Git

Please find below some of the most popular articles I've written on Gary's Guide To Git.

Table of Contents

Building a full-text index of git commits using lunr.js and Github APIs

Converting git commit history to a solr full-text index

Expert Search Statistics

Finding Corporate Sponsors of Open Source



Building a full-text index of git commits using lunr.js and Github APIs

GitHub has a nice API for inspecting repositories: it lets you read gists, issues, commit history, files, and so on. Git repository data lends itself to demonstrating the power of combining full-text and faceted search, as there is a mix of free-text fields (commit messages, code) and enumerable fields (committers, dates, committer [...]
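
As a rough illustration of the mix of free-text and enumerable fields mentioned above, here is a minimal Python sketch of full-text lookup combined with a facet filter. It is not the lunr.js code from the article; the sample commits and the names build_index, search, and facet_counts are hypothetical, and real records would come from the GitHub API.

```python
from collections import Counter, defaultdict

# Hypothetical commit records, made up for illustration only.
COMMITS = [
    {"id": 0, "message": "Fix search index bug", "committer": "alice"},
    {"id": 1, "message": "Add faceted search demo", "committer": "bob"},
    {"id": 2, "message": "Update README", "committer": "alice"},
]


def build_index(commits):
    """Map each lowercased word of a commit message to the ids of commits containing it."""
    index = defaultdict(set)
    for commit in commits:
        for word in commit["message"].lower().split():
            index[word].add(commit["id"])
    return index


def search(commits, index, term, committer=None):
    """Return commits matching a free-text term, optionally narrowed by a committer facet."""
    ids = index.get(term.lower(), set())
    hits = [c for c in commits if c["id"] in ids]
    if committer is not None:
        hits = [c for c in hits if c["committer"] == committer]
    return hits


def facet_counts(hits, field):
    """Count how many hits fall under each value of an enumerable field."""
    return Counter(hit[field] for hit in hits)


INDEX = build_index(COMMITS)
```

Calling search(COMMITS, INDEX, "search") and then facet_counts on the hits gives the per-committer counts that a faceted UI would typically show next to the result list.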

Converting git commit history to a solr full-text index

I built a 4-million-document archive from GitHub commits, which lets you search for open source experts, ranked by commit count. Click here to try the demo. Solr is a relatively recent addition to the world of Lucene (2007); it adds a web-app UI over Lucene, scaling (highly available reads), and configuration. For those [...]

Expert Search Statistics

The following are some interesting statistics about the GitHub expert-finder.

Unique repositories: 18,977
Source git repos: 250+ GB
Solr index size: 3.2 GB
Time to build index: ~12 hours spread over several days (had to restart indexer several times)
Number of commits: 4,579,236

Finding Corporate Sponsors of Open Source

I copied about 19,000 git repositories into a full-text Solr index. Because commits are tied to email addresses, this provides interesting insight into corporate open source contributions. The search front-end I added lets you search for programmers or companies, grouped by the number of commits. For example, searching for Linux returns the following results: linux-foundation [...]
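
The idea of tying commits to companies through email addresses can be sketched in a few lines of Python. This is only an assumption about how such a grouping might be done, not the Solr query behind the search front-end; the function names domain_of and commits_per_domain are hypothetical.

```python
from collections import Counter


def domain_of(email):
    """Return the lowercased part after the last '@', or None if the address is malformed."""
    _, sep, domain = email.rpartition("@")
    return domain.lower() if sep and domain else None


def commits_per_domain(commit_emails):
    """Count commits per email domain, most active domain first."""
    counts = Counter()
    for email in commit_emails:
        domain = domain_of(email)
        if domain is not None:  # skip addresses without a usable domain
            counts[domain] += 1
    return counts.most_common()
```

Passing one committer email per commit yields a list of (domain, commit count) pairs. A domain is only a rough proxy for a company, since many contributors commit from personal addresses.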