Gary's Guide To Git
Please find below some of the most popular articles I've written on Gary's Guide To Git.
Table of Contents
Building a full-text index of git commits using lunr.js and Github APIs
Converting git commit history to a solr full-text index
Expert Search Statistics
Finding Corporate Sponsors of Open Source
Building a full-text index of git commits using lunr.js and Github APIs
GitHub has a nice API for inspecting repositories: it lets you read gists, issues, commit history, files and so on. Git repository data lends itself to demonstrating the power of combining full-text and faceted search, as there is a mix of free-text fields (commit messages, code) and enumerable fields (committers, dates, committer [...]
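To make the full-text plus facet idea concrete, here is a minimal sketch in plain Python rather than lunr.js. The commit records, field names, and the `search` helper are all made up for illustration; a real run would pull commits from the GitHub API instead. It builds an inverted index over commit messages (the free-text field) and counts matches per author (an enumerable field).

```python
from collections import Counter, defaultdict

# Hypothetical commit records; field names are assumptions for this sketch.
COMMITS = [
    {"sha": "a1", "author": "alice", "message": "Fix index build on empty repos"},
    {"sha": "b2", "author": "bob", "message": "Add faceted search to index page"},
    {"sha": "c3", "author": "alice", "message": "Refactor search query parser"},
]

def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    words = (w.strip(".,:;!?()").lower() for w in text.split())
    return [w for w in words if w]

def build_index(commits):
    """Map each token to the set of commit positions whose message contains it."""
    index = defaultdict(set)
    for pos, commit in enumerate(commits):
        for token in tokenize(commit["message"]):
            index[token].add(pos)
    return index

def search(commits, index, query, author=None):
    """Return (matching commits, author facet counts) for an AND query."""
    tokens = tokenize(query)
    if not tokens:
        return [], Counter()
    hits = set.intersection(*(index.get(t, set()) for t in tokens))
    matched = [commits[i] for i in sorted(hits)]
    # Facet counts cover every text match, before the author filter is applied.
    facets = Counter(c["author"] for c in matched)
    if author is not None:
        matched = [c for c in matched if c["author"] == author]
    return matched, facets
```

Calling `search(COMMITS, build_index(COMMITS), "search", author="alice")` narrows the text matches to one committer, while the facet counts still show how the matches split across all authors.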
Converting git commit history to a solr full-text index
I built a 4-million-document archive from GitHub commits, which lets you search for open source experts, ranked by commit count. Click here to try the demo. Solr is a relatively recent addition to the world of Lucene (2007); it adds a web-app UI over Lucene, scaling (highly available reads), and configuration. For those [...]
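As a rough illustration of the conversion step, the sketch below turns `git log` output into a list of flat documents that could be serialized as JSON for an indexer. The `git log` format string, the field names, and the `log_to_docs` helper are assumptions made for this example, not the schema or pipeline used in the article.

```python
import json

# Assumed export command (not taken from the article):
#   git log --pretty=format:'%H%x1f%an%x1f%ae%x1f%aI%x1f%s'
# %x1f emits the ASCII unit separator, which is unlikely to appear in a subject line.
FIELD_SEP = "\x1f"
FIELDS = ("id", "author", "email", "date", "message")  # hypothetical field names

def log_to_docs(log_text, repo):
    """Convert one-commit-per-line log output into a list of dict documents."""
    docs = []
    for line in log_text.split("\n"):
        parts = line.split(FIELD_SEP, len(FIELDS) - 1)
        if len(parts) != len(FIELDS):
            continue  # skip blank or malformed lines
        doc = dict(zip(FIELDS, parts))
        doc["repo"] = repo
        docs.append(doc)
    return docs

def docs_to_json(docs):
    """Serialize the documents as a JSON array, one object per commit."""
    return json.dumps(docs)
```

Keeping each commit as one flat document means committer, date, and repository can each be treated as separate fields alongside the searchable message text.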
Expert Search Statistics
The following are some interesting statistics about the GitHub expert-finder.
Unique repositories: 18,977
Source git repos: 250+ GB
Solr index size: 3.2 GB
Time to build index: ~12 hours spread over several days (had to restart indexer several times)
Number of commits: 4,579,236
Finding Corporate Sponsors of Open Source
I copied about 19,000 git repositories into a full-text Solr index. Because commits are tied to email addresses, this provides interesting insight into corporate open source contributions. The search front-end I added lets you search for programmers or companies, grouped by the number of commits. For example, searching for Linux returns the following results: linux-foundation [...]
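The link between commits and companies comes from the committer's email domain. A minimal sketch of that grouping is below, assuming you already have one email address per commit; the helper names and sample addresses are hypothetical, and the real search front-end does this grouping inside Solr rather than in Python.

```python
from collections import Counter

def email_domain(email):
    """Return the lowercased domain of an email address, or None if there is none."""
    _, sep, domain = email.strip().lower().rpartition("@")
    return domain if sep and domain else None

def commits_by_domain(emails):
    """Count commits per email domain, most active domain first."""
    counts = Counter()
    for email in emails:
        domain = email_domain(email)
        if domain:
            counts[domain] += 1
    return counts.most_common()
```

Note that a domain is only a rough proxy for an employer: personal addresses at large mail providers would need to be filtered out or handled separately.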