ADF&G Division of Sport Fisheries Introduction to Git

Author

Adam Reimer

Published

July 22, 2023

Preface

This short book is intended to help staff within the Division of Sport Fisheries learn the use of Git and GitHub as a reproducible research best practice. Git provides ADF&G staff who write computer code with three main tools: 1) It provides a way to keep track of previous versions of a file or set of files without having to rename the file and retain multiple closely related versions, 2) it provides a way to make sure your analysis is discoverable to fellow researchers without requiring your direct involvement, and 3) it allows structured and documented collaboration between peers. These techniques are new but quickly becoming an industry standard.

Git has more features than we need for fisheries research with most of the educational material out there being written by and for professional programmers. This book attempts to close that gap by demonstrating the most common features of a git workflow in more accessible language. Best practice documents tend to describe appropriate practices for the largest, most complex analyses you are likely to face. If your project is simple, the overhead associated with some of these techniques may not be justified. Use your professional judgment to discern how to follow the spirit of reproducible research while modifying this guidance to specific situations you encounter. There is an appendix describing the minimum requirements for ADF&G/DFG analysis.

The book is structured as follows.

  • Chapter 1 describes what Git is, how it can help our research projects, and conventions used within this book.

  • Chapter 2 describes the process and commands necessary to use Git to track changes within your working directory.

  • Chapter 3 describes using Git and GitHub to store your work on GitHub. In the chapter collaborators can take several meaning: you can collaborate with yourself by using GitHub to allow you modify your analysis from two locations, your can collaborate with a peer to improve each others analysis, and/or you can collaborate with your professional predecessor and descendants by keeping your analysis discoverable.

  • Chapter 4 introduces some higher overhead concepts that can be very helpful on large, complicated analysis which require significant model development.

  • Chapter 5 shows you how to retrieve previous versions of a file (or the entire analysis) from your Git repository.

  • Chapter 6 reminds us all to leave our analyses in a good place for the next person and/or our future self.

  • Appendix A describes the RStudio Git graphical user interface (GUI).

  • Appendix B synthesizes the concepts in this book into a minimum reproducible research standard for ADF&G DSF Biometricians.