
Creating my first Homebrew package with Rust

Posted February 1, 2024 • 5 min read

What did I make?

GitHub repo & Homebrew installation

I developed a simple CLI tool in Rust that takes a GitHub username and a fine-grained personal access token generated for that account. It then queries GitHub's API for all of that user's private and public repositories and summarizes their total lifetime commits, lines of code added, and lines removed.

Why build this?

A professor at my university inquired recently in class about the total lines of code we thought we had all written in our lifetimes. No one had a definitive answer. This sparked my interest in creating a tool that could provide a quick answer. The only caveat is that the tool only analyzes code committed to GitHub. Therefore, if the majority of your code isn't checked in on GitHub, the results are largely useless.

Building the tool

I started a Rust project with cargo new ghloc, opting to use Rust because of its adorable mascot, Ferris.

Right off the bat, I delved into learning how to effectively use GitHub's API. The comprehensive documentation made it clear that I needed to focus on two main endpoints:

To pull all of a user's repositories: (needing GITHUB_TOKEN & PAGE_NUMBER)

curl -H "Accept: application/vnd.github.v3+json" \
     -H "Authorization: token GITHUB_TOKEN" \
     "https://api.github.com/user/repos?type=all&per_page=100&page=PAGE_NUMBER" 

To list a repository's contributor statistics: (needing GITHUB_TOKEN, USERNAME, & REPO_NAME)

curl -H "Accept: application/vnd.github.v3+json" \
     -H "Authorization: token GITHUB_TOKEN" \
     "https://api.github.com/repos/USERNAME/REPO_NAME/stats/contributors"
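As a rough sketch, the placeholders in the two curl commands above could be filled in with a pair of small helpers. The function names here are hypothetical and not taken from ghloc's source; only the URLs come from the commands above.

```rust
/// Base URL of GitHub's REST API, as used in the curl commands above.
const API_BASE: &str = "https://api.github.com";

/// Builds the URL for one page of the authenticated user's repositories.
/// (Hypothetical helper name, for illustration only.)
fn user_repos_url(page: u32) -> String {
    format!("{}/user/repos?type=all&per_page=100&page={}", API_BASE, page)
}

/// Builds the URL for one repository's contributor statistics.
/// (Hypothetical helper name, for illustration only.)
fn contributor_stats_url(username: &str, repo_name: &str) -> String {
    format!("{}/repos/{}/{}/stats/contributors", API_BASE, username, repo_name)
}
```

Keeping the URL construction in one place makes it harder for the page number or repository name to end up in the wrong slot.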

My primary concerns were rate limiting and acquiring the GITHUB_TOKEN. The former became a minor issue after consulting the documentation, which assured me of the lenient limits (mainly 5000 requests per hour for an authenticated user). Moreover, obtaining the token was straightforward, requiring just a few steps.

Next, I went to work fetching all the user's repos using the first endpoint, iterating through pages until the retrieved results were fewer than requested. Then, for each repository, I utilized the second endpoint.
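The "keep going until a page comes back short" loop can be sketched roughly as follows. This is not ghloc's actual code: fetch_page is a hypothetical stand-in for the HTTP request, so the example stays std-only.

```rust
/// Collects every item from a paginated source.
///
/// `fetch_page` is a hypothetical stand-in for the HTTP call: it takes a
/// 1-based page number and returns the items on that page.
/// `per_page` is assumed to be greater than zero.
fn fetch_all_pages<T, F>(per_page: usize, mut fetch_page: F) -> Vec<T>
where
    F: FnMut(u32) -> Vec<T>,
{
    let mut all = Vec::new();
    let mut page = 1;
    loop {
        let items = fetch_page(page);
        let count = items.len();
        all.extend(items);
        // A short (or empty) page means there is nothing left to fetch.
        if count < per_page {
            break;
        }
        page += 1;
    }
    all
}
```

Note that when the total is an exact multiple of per_page, this approach costs one extra request that returns an empty page.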

This process was slightly complex because GitHub initially returns a 202 ACCEPTED status with an empty response for repository statistics not already cached, indicating a calculation process in the background.

To address this, I used tokio to asynchronously poll each repository's endpoint in its own spawned task, retrying up to n times until a 200 OK status was returned with the desired data. To be cautious, I added progressively longer delays between attempts to avoid triggering 429 or 403 responses.
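Here is a minimal sketch of that retry logic. It is a simplified, blocking version built on std::thread::sleep rather than the tokio-based code in ghloc; an async version would presumably await tokio::time::sleep instead. The request closure, the function names, and the linear delay schedule are all assumptions for illustration.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Delay before the retry that follows attempt number `attempt` (1-based).
/// Grows linearly; the real schedule used by ghloc is not shown in the post.
fn retry_delay(base: Duration, attempt: u32) -> Duration {
    base * attempt
}

/// Calls `request` until it reports status 200 or `max_attempts` is used up.
///
/// `request` is a hypothetical stand-in for the HTTP call; it returns the
/// status code and the response body.
fn poll_until_ready<F>(max_attempts: u32, base: Duration, mut request: F) -> Option<String>
where
    F: FnMut() -> (u16, String),
{
    for attempt in 1..=max_attempts {
        let (status, body) = request();
        match status {
            200 => return Some(body),
            // 202: statistics are still being computed, so wait and retry.
            202 if attempt < max_attempts => sleep(retry_delay(base, attempt)),
            // 202 on the final attempt: give up without sleeping.
            202 => {}
            // Any other status is treated as a failure in this sketch.
            _ => return None,
        }
    }
    None
}
```

With a base delay of 500 ms, the waits would be 500 ms, 1 s, 1.5 s, and so on, which keeps the first retry fast while backing off for slow repositories.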

For a smooth user experience, I waited for the polling attempts in all tasks to finish and compiled their results before presenting the final output. I did this by keeping each task's handle (of type JoinHandle<RepoContributions>) returned from the fetch_contributors(...) function and resolving them all together later:

// ops.rs

let mut handles = Vec::new(); // Storing handles for each attempt

for repo_name in &user_repos {
    println!("Fetching data for {}", repo_name.name);
    let handle = fetch_contributors(&token, &repo_name.name, &username);
    handles.push(handle);
}

... and then I aggregated the results once every task had completed:

// ops.rs

// Proceeding once all attempts are completed
let results = join_all(handles).await;

// data aggregation here...

This approach proved to be highly effective.
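The aggregation step that the snippet elides might look something like the sketch below. The fields of RepoContributions are assumptions, since the post does not show the real definition. Because awaiting a tokio JoinHandle<T> yields a Result, the joined results are modeled here as a Vec of Result values with a generic error type.

```rust
/// Per-repository totals. The fields are assumptions for illustration;
/// the post does not show the real `RepoContributions` definition.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct RepoContributions {
    commits: u64,
    additions: u64,
    deletions: u64,
}

/// Sums the totals of every repository whose task succeeded.
/// Failed tasks (`Err`) are skipped rather than aborting the whole run.
fn aggregate<E>(results: Vec<Result<RepoContributions, E>>) -> RepoContributions {
    results
        .into_iter()
        .filter_map(Result::ok)
        .fold(RepoContributions::default(), |mut total, repo| {
            total.commits += repo.commits;
            total.additions += repo.additions;
            total.deletions += repo.deletions;
            total
        })
}
```

Skipping failed tasks is a design choice made for this sketch: one repository that never finishes computing its statistics should not prevent the rest from being reported.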

The final step was to improve the tool's CLI usability. I used the clap crate to quickly set up a structure for the tool's arguments. Here's a snippet showing how simple the third-party crate is to use:

// main.rs

let matches = App::new(config::APP_NAME)
    .version(format!("v{}", env!("CARGO_PKG_VERSION")).as_str())
    .author("Matthew Trent • matthewtrent.me")
    .about("Fetches your total LOC and commits from all your GitHub repos
For info on how to get a GitHub token, check the README.md at: github.com/mattrltrent/ghloc")
    .subcommand(SubCommand::with_name("creds").about("Displays the current GitHub credentials"))
    .subcommand(SubCommand::with_name("example").about("Shows an example of usage"))

// and so on...

This allowed for intuitive commands like ghloc creds or ghloc stats, displaying results directly in the shell.

Publishing with Homebrew

Configuring Homebrew was somewhat challenging, so I'll skip the extensive debugging parts (I'm looking at you, Xcode errors).

To start, I created a separate GitHub repository named homebrew-tap (the homebrew- prefix is what lets brew tap mattrltrent/tap find it), containing a Formula directory with a single ghloc.rb file:

# ghloc.rb

class Ghloc < Formula
  desc "Fetches your total LOC and commits from all your GitHub repos"
  homepage "https://github.com/mattrltrent/ghloc"
  url "https://github.com/mattrltrent/ghloc/archive/v1.0.0.tar.gz"
  sha256 "7f1578dac8f0a8d620f39352924f0085932404141516c4780a9bee7efa4c8b0a"

  depends_on "rust" => :build

  def install
    system "cargo", "install", *std_cargo_args
  end

  test do
    # this test runs `ghloc --version` and checks that it starts with "ghloc "
    assert_match(/^ghloc /, shell_output("#{bin}/ghloc --version"))
  end
end

This Ruby script contains the necessary metadata and the installation command, which builds the tool with cargo. It also includes a test that checks the binary was installed properly.

For the url field, I created a GitHub release and pointed the field at the .tar.gz file for that release's version. To produce that .tar.gz file, I ran the following command (the version number here should match the one on the GitHub release):

tar --exclude='./target' -czvf ghloc-1.0.0.tar.gz . 

Then, to obtain the sha256 value, I executed:

shasum -a 256 ghloc-1.0.0.tar.gz 

... which provided the required hash.

Finally, after setting up the desc and homepage fields (these were intuitive), I was able to install the package locally on my MacBook (and anywhere now!) using:

brew tap mattrltrent/tap ; brew install ghloc

This enabled me to run my Rust package locally, which was super neat; you should try it out!

If you've read this far, a star on GitHub would be appreciated 😉