argoneus - 1 month ago
Git Question

What's the best practice with git for multiple language implementations?

I have a few private git repositories, each a different language implementation (Python, Java, etc.) of the same algorithm. The implementations are functionally identical: they perform the same steps and give the same output. They are currently separate repos, but I am wondering whether I should unify them into one repo, with a directory per language, like:

master
- java
- python
- ruby


I could combine the repos with a command that preserves their history, so that's not an issue. I'm just curious what the best practice is here.
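The exact combine command is not named here. As a minimal sketch of one way history could be preserved, assuming plain git (2.9 or later, for --allow-unrelated-histories) and hypothetical stand-in repos named java-impl and python-impl: each repo first moves its files under its language directory, then the combined repo fetches and merges each one.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Hypothetical stand-ins for the existing per-language repositories.
for lang in java python; do
    git init -q "${lang}-impl"
    (
        cd "${lang}-impl"
        git config user.name "Example"
        git config user.email "example@example.com"
        echo "${lang} implementation" > algorithm.txt
        git add algorithm.txt
        git commit -q -m "Add ${lang} implementation"
        # Move the files under a language directory so the merged trees do not collide.
        mkdir "${lang}"
        git mv algorithm.txt "${lang}/"
        git commit -q -m "Move ${lang} implementation into ${lang}/"
    )
done

# The unified repository: start from an empty commit, then merge each history in.
git init -q combined
cd combined
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "Start combined repository"
for lang in java python; do
    git fetch -q "../${lang}-impl" HEAD
    git merge -q --allow-unrelated-histories -m "Import ${lang} history" FETCH_HEAD
done

# Every original commit is still reachable from the combined history.
commit_count=$(git rev-list --count HEAD)
git log --oneline --graph
```

Each original commit keeps its hash this way, since nothing is rewritten; the cost is one extra "move" commit per repo and one merge commit per import.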

Answer

I had this same question with Mercurial, and an algorithm (COBS) that I wanted to implement in C and Python.

Eventually I decided to split it into separate repositories (even though the Python implementation included a C extension that had similar code to the plain C implementation). My reasoning came down to:

  • I wanted to have independent version numbering of the implementations, and independent releases.
    • git describe is a nice feature for identifying a version based on the most recent annotated tag. With just one implementation in the repository, using git describe is simple. But if implementations with separate version numbers share one repository, it becomes more complicated: you need the --match option to limit it to tags with a given prefix, e.g. git describe --match "python*"
  • The way Python modules are typically organised (Python module packaging best-practices), it made more sense to me to keep the Python implementation separate and self-contained.
  • All else being equal, I tend to favour more fine-grained modularity.
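The git describe point above can be tried out in a throwaway repository. This is only a sketch: the tag names c-1.0.0 and python-0.9.0 and the file names are made up for illustration, and the patterns use a hyphenated prefix (the "python*" pattern mentioned above would match the same tag).

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

# Two implementations in one repository, each with its own annotated release tag.
mkdir c python
echo "C implementation" > c/cobs.c
git add c
git commit -q -m "Add C implementation"
git tag -a c-1.0.0 -m "C release 1.0.0"

echo "Python implementation" > python/cobs.py
git add python
git commit -q -m "Add Python implementation"
git tag -a python-0.9.0 -m "Python release 0.9.0"

# Plain describe picks the nearest annotated tag, whichever implementation it belongs to.
any_version=$(git describe)

# --match restricts the search to tags with the given prefix.
c_version=$(git describe --match "c-*")
py_version=$(git describe --match "python-*")

echo "unfiltered: $any_version"
echo "C:          $c_version"
echo "Python:     $py_version"
```

Here the unfiltered call reports the Python tag, because it is the closest one to HEAD, while the C result carries a suffix counting the commits made since the C tag. That suffix changes whenever the other implementation is committed to, which is part of why separate repositories keep the version strings simpler.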