What is the difference between Git and CVS version control systems?
I have been happily using CVS for over 10 years, and now I have been told that Git is much better. Could someone please explain what the difference between the two is, and why one is better than the other?
The main difference is that (as it was already said in other responses) CVS is (old) centralized version control system, while Git is distributed.
But even if you use version control for single developer, on single machine (single account), there are a few differences between Git and CVS:
Setting up repository. Git stores repository in
.git directory in top directory of your project; CVS require setting up CVSROOT, a central place for storing version control info for different projects (modules). The consequence of that design for user is that importing existing sources into version control is as simple as "git init && git add . && git commit" in Git, while it is more complicated in CVS.
Atomic operations. Because CVS at beginning was a set of scripts around per-file RCS version control system, commits (and other operations) are not atomic in CVS; if an operation on the repository is interrupted in the middle, the repository can be left in an inconsistent state. In Git all operations are atomic: either they succeed as whole, or they fail without any changes.
Changesets. Changes in CVS are per file, while changes (commits) in Git they always refer to the whole project. This is very important paradigm shift. One of consequences of this is that it is very easy in Git to revert (create a change that undoes) or undo whole change; other consequence is that in CVS is easy to do partial checkouts, while it is currently next to impossible in Git. The fact that changes are per-file, grouped together led to invention of GNU Changelog format for commit messages in CVS; Git users use (and some Git tools expect) different convention, with single line describing (summarizing) change, followed by empty line, followed by more detailed description of changes.
Naming revisions / version numbers. There is another issue connected with the fact that in CVS changes are per files: version numbers (as you can see sometimes in keyword expansion, see below) like 1.4 reflects how many time given file has been changed. In Git each version of a project as a whole (each commit) has its unique name given by SHA-1 id; usually first 7-8 characters are enough to identify a commit (you can't use simple numbering scheme for versions in distributed version control system -- that requires central numbering authority). In CVS to have version number or symbolic name referring to state of project as a whole you use tags; the same is true in Git if you want to use name like 'v1.5.6-rc2' for some version of a project... but tags in Git are much easier to use.
Easy branching. Branches in CVS are in my opinion overly complicated, and hard to deal with. You have to tag branches to have a name for a whole repository branch (and even that can fail in some cases, if I remember correctly, because of per-file handling). Add to that the fact that CVS doesn't have merge tracking, so you have to either remember, or manually tag merges and branching points, and manually supply correct info for "cvs update -j" to merge branches, and it makes for branching to be unnecessary hard to use. In Git creating and merging branches is very easy; Git remembers all required info by itself (so merging a branch is as easy as "git merge branchname")... it had to, because distributed development naturally leads to multiple branches.
This means that you are able to use topic branches, i.e. develop a separate feature in multiple steps in separate feature branch.
Rename (and copy) tracking. File renames are not supported in CVS, and manual renaming might break history in two, or lead to invalid history where you cannot correctly recover the state of a project before rename. Git uses heuristic rename detection, based on similarity of contents and filename (This solution works well in practice). You can also request detecting of copying of files. This means that:
Binary files. CVS has only a very limited support for binary files (e.g. images), requiring users to mark binary files explicitly when adding (or later using "cvs admin", or via wrappers to do that automatically based on file name), to avoid mangling of binary file via end-of-line conversion and keyword expansion. Git automatically detects binary file based on contents in the same way CNU diff and other tools do it; you can override this detection using gitattributes mechanism. Moreover binary files are safe against unrecoverable mangling thanks to default on 'safecrlf' (and the fact that you have to request end-of-line conversion, although this might be turned on by default depending on distribution), and that (limited) keyword expansion is a strict 'opt-in' in Git.
Keyword expansion. Git offers a very, very limited set of keywords as compared to CVS (by default). This is because of two facts: changes in Git are per repository and not per file, and Git avoids modifying files that did not change when switching to other branch or rewinding to other point in history. If you want to embed revision number using Git, you should do this using your build system, e.g. following example of GIT-VERSION-GEN script in Linux kernel sources and in Git sources.
Amending commits. Because in distributed VCS such as Git act of publishing is separate from creating a commit, one can change (edit, rewrite) unpublished part of history without inconveniencing other users. In particular if you notice typo (or other error) in commit message, or a bug in commit, you can simply use "git commit --amend". This is not possible (at least not without heavy hackery) in CVS.
More tools. Git offers much more tools than CVS. One of more important is "git bisect" that can be used to find a commit (revision) that introduced a bug; if your commits are small and self-contained it should be fairly easy then to discover where the bug is.
If you are collaboration with at least one other developer, you would find also the following differences between Git and CVS:
Commit before merge Git uses commit-before-merge rather than, like CVS, merge-before-commit (or update-then-commit). If while you were editing files, preparing for creating new commit (new revision) somebody other created new commit on the same branch and it is now in repository, CVS forces you to first update your working directory and resolve conflicts before allowing you to commit. This is not the case with Git. You first commit, saving your state in version control, then you merge other developer changes. You can also ask the other developer to do the merge and resolve conflicts.
If you prefer to have linear history and avoid merges, you can always use commit-merge-recommit workflow via "git rebase" (and "git pull --rebase"), which is similar to CVS in that you replay your changes on top of updated state. But you always commit first.
No need for central repository With Git there is no need to have single central place where you commit your changes. Each developer can have its own repository (or better repositories: private one in which he/she does development, and public bare one where she/he publishes that part which is ready), and they can pull/fetch from each other repositories, in symmetric fashion. On the other hand it is common for larger project to have socially defined/nominated central repository from which everyone pull from (get changes from).
Finally Git offers many more possibilities when collaboration with large number of developers is needed. Below there are differences between CVS in Git for different stages of interest and position in a project (under version control using CVS or Git):
lurker. If you are interested only in getting latest changes from a project, (no propagation of your changes), or doing private development (without contributing back to original projects); or you use foreign projects as a basis of your own project (changes are local and doesn't it make sense to publish them).
Git supports here anonymous unauthenticated read-only access via custom efficient
git://protocol, or if you are behind firewall blocking
DEFAULT_GIT_PORT (9418) you can use plain HTTP.
For CVS most common solution (as I understand it) for read-only access is guest account for 'pserver' protocol on
CVS_AUTH_PORT (2401), usually called "anonymous" and with empty password. Credentials are stored by default in
$HOME/.cvspass file, so you have to provide it only once; still, this is a bit of barrier (you have to know name of guest account, or pay attention to CVS server messages) and annoyance.
fringe developer (leaf contributor). One way of propagating your changes in OSS is sending patches via email. This is most common solution if you are (more or less) accidental developer, sending single change, or single bugfix. BTW. sending patches might be via review board (patch review system) or similar means, not only via email.
Git offers here tools which help in this propagation (publishing) mechanism both for sender (client), and for maintainer (server). For people who want send their changes via email there is "git rebase" (or "git pull --rebase") tool to replay your own changes on top of current upstream version, so your changes are on top of current version (are fresh), and "git format-patch" to create email with commit message (and authorship), change in the form of (extended) unified diff format (plus diffstat for easier review). Maintainer can turn such email directly into commit preserving all information (including commit message) using "git am".
CVS offer no such tools: you can use "cvs diff" / "cvs rdiff" to generate changes, and use GNU patch to apply changes, but as far as I know there is no way to automate applying commit message. CVS was meant to be used in client <-> server fashion...
lieutenant. If you are maintainer of separate part of a project (subsystem), or if development of your project follows "network of trust" workflow used in development of Linux kernel... or just if you have your own public repository, and the changes you want to publish are too large to send via email as patch series, you can send pull request to (main) maintainer of project.
This is solution specific to distributed version control systems, so of course CVS doesn't support such way of collaboration. There is even a tool called "git request-pull" which help to prepare email to send to maintainer with request to pull from your repository. Thanks to "git bundle" you can use this mechanism even without having public repository, by sending bundle of changes via email or sneakernet. Some of Git hosting sites like GitHub have support for notifying that somebody is working (published some work) on your project (provided that he/she uses the same Git hosting site), and for PM-ing a kind of pull request.
main developer, i.e. somebody who directly publish his/her changes (to main/canonical repository). This category is broader for distributed version control systems, as having multiple developers with write access to central repository is not only possible workflow (you can have single maintainer who pushes changes to canonical repository, a set of lieutenants/subsystem maintainers from which he/she pulls, and broad range of leaf developers who send patches via mail either to maintainer/project mailing list, or to one of lieutenants/submaintainers).
With Git you have choice of using SSH protocol (git protocol wrapped in SSH) to publish changes, with tools such as "git shell" (to help security, limiting access of shell accounts) or Gitosis (to manage access without requiring separate shell accounts), and HTTPS with WebDAV, with ordinary HTTP authentication.
With CVS there is a choice between custom unencrypted (plain text) pserver protocol, or using remote shell (where you really should use SSH) to publish your changes, which for centralized version control system means committing your changes (creating commits). Well, you can also tunnel 'pserver' protocol using SSH, and there are thir party tools automating this... but I don't think this is as easy as e.g. Gitosis.
In general distributed version control systems, such as Git, provide much wider selection of possible workflows. With centralized version control systems, such as CVS, by necessity you have to distinguish between people with commit access to repository, and those without... and CVS doesn't offer any tools to help with accepting contributions (via patches) from people without commit access.
Karl Fogel in Producing Open Source Software in section about version control states that it is better to not provide too strict, rigid and rigorous controls on areas where one is allowed to make changes to public repository; it is much better to rely (for this) on social restrictions (such as code review) than on technical restrictions; distributed version control systems reduce that IMHO even further...
HTH (Hope That Helps)