Over the last few weeks our team has been putting together the frontend and backend components of the Gin prototype. Our efforts have centred on establishing a single repository on GitHub, currently administered by Michelle, where Vincent and Leemor’s frontend code has been integrated. We now have a platform for future developers to start from.

For a future UCOSP team getting started with this project, I would recommend establishing a workflow with GitHub as soon as possible. We have found that an effective model is to use a single repository, which avoids drift between team members’ work that can grow into headaches when it comes time to merge code. A live voice or text chat kept open across team members during planned work sessions is also valuable, in addition to weekly meetings and an actively used mailing list.

Hosting on a public repository such as GitHub will be essential as this project matures into a proper open source endeavour. Remember that this project is designed to be adopted by anybody; it is vital that all work be maintained in one public location.

Getting started should be a matter of installing Django and cloning the Gin repository. Keep in mind that we have not been publishing all the files essential to a complete Django project! The most glaring omissions to resolve before running the server from the repo are settings.py and manage.py. (For a complete list, view the .gitignore file.) The best way to generate these is to create a new Django project and use its files as templates for what’s missing.
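As a rough illustration of the copy-over step, here is a minimal Python sketch. It assumes you have already created a throwaway project with `django-admin startproject` (for example into a directory called `scratch_project`) and that the missing files belong at the top level of your Gin clone. The function name `copy_missing`, the directory names, and that target location are my own assumptions, not something taken from the repo, so adjust them to the actual layout.

```python
import shutil
from pathlib import Path

# The two omissions named above; consult .gitignore for the complete list.
MISSING_FILES = ["settings.py", "manage.py"]


def copy_missing(template_dir, repo_dir, names=MISSING_FILES):
    """Copy each named file from a fresh Django project into the clone.

    Files that already exist in the clone are left untouched, and files
    that cannot be found in the template are skipped.
    Returns the list of file names that were actually copied.
    """
    template_dir, repo_dir = Path(template_dir), Path(repo_dir)
    copied = []
    for name in names:
        # Search recursively, since Django versions differ on where
        # settings.py lives inside a new project.
        matches = sorted(template_dir.rglob(name))
        target = repo_dir / name  # assumed location: top level of the clone
        if matches and not target.exists():
            shutil.copy2(matches[0], target)
            copied.append(name)
    return copied
```

Calling `copy_missing("scratch_project", "gin")` would then fill in whatever is absent without overwriting your existing work. You will still need to edit the copied settings.py by hand so that it matches the Gin apps and your database.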

You will also need a database, which we do not include on GitHub. It can be initialized from the project’s schema using Django’s built-in utilities. I recommend creating a seed file to distribute among the team for the sake of testing or demonstrating features. There is a shell script in the GitHub repo to assist with this.
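To give a feel for what the initialization involves, here is a hedged Python sketch of the two manage.py steps: build the schema, then load a seed fixture. It is not the shell script from the repo. The fixture name `seed.json` and the helper names are hypothetical, and the schema command depends on your Django version (`syncdb` on older releases, `migrate` on newer ones), so it is left as a parameter.

```python
import subprocess
import sys


def seed_commands(fixture="seed.json", init_command="syncdb"):
    """Return the manage.py invocations that create the schema and then
    load a seed fixture.

    init_command is "syncdb" on older Django releases and "migrate" on
    newer ones; the fixture name is a placeholder.
    """
    manage = [sys.executable, "manage.py"]
    return [
        manage + [init_command, "--noinput"],  # create tables without prompting
        manage + ["loaddata", fixture],        # load the shared seed data
    ]


def run_all(commands, cwd="."):
    """Run each command in order from the project directory, stopping at
    the first failure."""
    for cmd in commands:
        subprocess.check_call(cmd, cwd=cwd)
```

From the directory containing manage.py, `run_all(seed_commands())` would run both steps. The seed file itself can be produced from a populated database with Django’s `dumpdata` command and then passed around the team.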

Some other tools we have found useful for sharing visualizations of project goals are ArgoUML and Balsamiq. A UML diagram of the backend API, for example, is a lifesaver. Above all, I highly recommend working through a Git tutorial and a Django tutorial.



A solid platform that keeps a range of data collection and retrieval strategies open will continue to inspire contributors as this project matures. Following VC news aggregators is a great source of inspiration for social media analysis, because many startups and social media ventures encounter issues similar to ours: planning ahead for scale (cf. http://www.slideshare.net/dibau_naum_h/large-scale-processing-with-django) and deciding which database technologies are best suited to our purpose. This recent blog post discusses using Google Chrome’s open source language detection feature, which is based on an n-gram model, with Python bindings: http://blog.mikemccandless.com/2011/10/language-detection-with-googles-compact.html. And don’t forget to follow Google’s chart API (http://code.google.com/apis/chart/interactive/docs/more_charts.html) as a reminder of the interesting visualizations that can be integrated with your working Gin dataset with some careful planning.

Good luck!