4 data science examples after 1 month of studying

In the spirit of openness and of sharing our experience as we work through the program, here are 4 examples of what we have worked on while pursuing our online data science master's. These examples are not necessarily outstanding data science, but they demonstrate concepts that data science requires, as well as what you might be able to achieve after 1 month of studying our first 5 courses.

Harvard Visualisation Course – CS171

Example 1

Language: JavaScript    Libraries: D3.js    Course: Harvard Visualisation (CS171)

D3 Network Graph

Link and Node Diagram GitHub Network D3


Sortable, Highlightable, Table & Bar Chart D3

Description: These are two examples of interactive visualisations that run in the browser using D3. Although at this stage they are not aesthetically pleasing, they nonetheless demonstrate the power of interactive browser graphics. The first graphic shows GitHub commit branches and can switch between a temporal view and sequential spacing. The second shows a table which allows selection, comparison and filtering of the data.

Programming Abstractions – CS106B

Example 2
 
Language: C++    Libraries: Stanford Lib    Course: Stanford Programming Abstractions (CS106B)

Chaos Game

Chaos Game C++ Program

Description: The user is asked to select three random points on the screen, called A, B and C; the picture is then drawn by repeating the following steps:

1) Select one of A, B or C at random and make it the current point.
2) Draw a small circle around the current point.
3) Pick one of A, B or C at random and move the current point halfway towards that vertex.

Steps 2 and 3 are then repeated.

The same picture, a Sierpinski triangle, is drawn every time 🙂 which is very neat!
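The procedure can be sketched in a few lines of standard C++. This is only a minimal sketch, assuming the usual formulation of the chaos game in which the current point moves halfway towards a randomly chosen vertex. It records the points rather than drawing them with the Stanford graphics library, and the names `Point`, `chaosGame`, `steps` and `seed` are illustrative, not taken from the course code.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// A 2D point in screen coordinates.
struct Point {
    double x;
    double y;
};

// Plays the chaos game on a triangle and returns every point that would be
// drawn. `steps` and `seed` are illustrative parameters (assumptions), not
// part of the course assignment.
std::vector<Point> chaosGame(const std::vector<Point>& vertices,
                             std::size_t steps, unsigned seed) {
    assert(vertices.size() == 3 && "the game is played on three points");

    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> pickVertex(0, 2);

    // Step 1: start at one of the three vertices, chosen at random.
    Point current = vertices[pickVertex(rng)];

    std::vector<Point> drawn;
    drawn.reserve(steps);
    for (std::size_t i = 0; i < steps; ++i) {
        // Step 2: "draw" the current point (here we only record it).
        drawn.push_back(current);

        // Step 3: pick a random vertex and move halfway towards it.
        const Point& target = vertices[pickVertex(rng)];
        current.x = (current.x + target.x) / 2.0;
        current.y = (current.y + target.y) / 2.0;
    }
    return drawn;
}
```

Calling `chaosGame({{0, 0}, {100, 0}, {50, 80}}, 5000, 42)` returns 5000 points that all stay inside the triangle; plotting each one as a small circle should reveal the picture described above.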

We normally work from the terminal and, whenever possible, use the C++ standard libraries to avoid running into issues with the Stanford libraries. For this particular example we had to use the Stanford library, so we provide the command to compile the program from the terminal using the Stanford library folder, which can be downloaded from their website. Our code for the above, as well as for other C++ course exercises, is available here.

To compile the file in terminal:

 g++ filename StanfordCPPLib/*.cpp -lpthread 

where StanfordCPPLib is the name of the directory that contains all .cpp library files.

Example 3

This next example was incredibly fun to work on: it randomly generates text based on the sequence of characters seen so far. It was particularly exciting to work on this in C++ because we had covered Markov models in Python just the week before, in the Natural Language Processing course from Columbia University. Here we illustrate a random text generator based on an order 6 Markov model, in which each new character is generated from the 6 characters that precede it, and one based on an order 2 Markov model, which only considers the preceding two characters.
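To make the idea concrete, here is a minimal sketch of a character-level Markov text generator in standard C++. It is not our course submission: the names `MarkovModel`, `buildModel` and `generateText`, and the choice to let the caller supply the starting characters and a random seed, are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <random>
#include <string>
#include <vector>

// Maps each run of `order` characters to the characters that followed it in
// the source text. Repeats are kept, so frequent continuations are picked
// more often.
using MarkovModel = std::map<std::string, std::vector<char>>;

MarkovModel buildModel(const std::string& text, std::size_t order) {
    assert(order > 0 && text.size() > order);
    MarkovModel model;
    for (std::size_t i = 0; i + order < text.size(); ++i) {
        model[text.substr(i, order)].push_back(text[i + order]);
    }
    return model;
}

// Generates up to `length` characters, beginning with `start`, which should
// be `order` characters long and occur in the source text.
std::string generateText(const MarkovModel& model, const std::string& start,
                         std::size_t length, unsigned seed) {
    std::mt19937 rng(seed);
    std::string output = start;
    std::string window = start;  // the most recent `order` characters
    while (output.size() < length) {
        auto it = model.find(window);
        if (it == model.end()) {
            break;  // this window has no recorded follower, so stop early
        }
        const std::vector<char>& followers = it->second;
        std::uniform_int_distribution<std::size_t> pick(0, followers.size() - 1);
        char next = followers[pick(rng)];
        output.push_back(next);
        // Slide the window forward by one character.
        window.erase(0, 1);
        window.push_back(next);
    }
    return output;
}
```

With `order` set to 6 the model has few choices at each step and mostly reproduces whole words from the source; with `order` set to 2 it has many more choices, which is why the output drifts into invented words.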

Original text (from Rocky Balboa):

But somewhere along the line, you changed. You stopped being you. You let people stick a finger in your face and tell you you’re no good. And when things got hard, you started looking for something to blame, like a big shadow. Let me tell you something you already know. The world ain’t all sunshine and rainbows. It’s a very mean and nasty place and I don’t care how tough you are it will beat you to your knees and keep you there permanently if you let it. You, me, or nobody is gonna hit as hard as life. But it ain’t about how hard you hit. It’s about how hard you can get hit and keep moving forward. How much you can take and keep moving forward. That’s how winning is done! Now if you know what you’re worth then go out and get what you’re worth. But ya gotta be willing to take the hits, and not pointing fingers saying you ain’t where you wanna be because of him, or her, or anybody! Cowards do that and that ain’t you! You’re better than that!

Random text – Markov model 6 (all words are English, and the text resembles the original style!):

ain’t all sunshine and that ain’t where you wanna be because of him, or her, or anybody! Cowards do that ain’t where along the line, you can take the hits, and not pointing fingers saying you already know. The world ain’t about how hard you hit. It’s about how hard as life. But ya gotta be willing to blame, like a big shadow. Let me tell you started looking forward. How much you can get hit and keep you to your face and I don’t care how tough you can take and rainbows. It’s a very mean and nasty place and keep you there permanently if you let it. You, me, or nobody is gonna hit and tell you you’re no good. And when things got hard, you started looking forward. How much you already know. The world ain’t you! You’re worth then go out and tell you started looking forward. How much you changed. You stopped being you. You stopped being you ain’t you! You’re worth. But ya gotta be willing to take the hits, and nasty place and keep moving forward. That’s how winning is done! Now if you let people stick a fingers saying you ain’t all sunshine and keep you there permanently if you know what you’re worth then go out and get what you’re worth then go out and that ain’t where along the line, you something you. You let it. You, me, or nobody is gonna hit and tell you you’re better than that!

Random text – Markov model 2 (what do you think?):

ou! Cow meand good. But hat’s and know how where notta being if you sty movinnin’t ife. Let hare an gereand and na for fing it. Lethare no gong forld know will you ta becand rain’t yout ple place will be will you stake tard keep movinget ward tards dow. But you card lin’t you thard nastake you an you hit. The loody moving it you anged. You he an gere aing thit an get yough you cause you and wor is do you lin toplame, of your hit yout poine hinbou leth then th. But a finna forwarthat’s an gone! Cow. Let you carted no ted keep movinen go tell be nobout will some, yout anently me, you and will somet poing take nobou’re wor forward. How hat!

Introduction to Databases – Coursera

Example 4

Language: SQL    Implementation: SQLite   Course: Introduction to Databases (Stanford via Coursera)

We thought we would leave the plainest example until last (no picture this time, unfortunately). Recently we have been learning just how important relational algebra, and SQL in particular, are for data science, especially for big data problems or scenarios where you might want to distribute computing power. Here is an old-fashioned SQL query demonstrating conditional selections, relational algebra joins and aggregate functions (bingo, you have a table selected, riveting I know!).

SELECT name
FROM (
    SELECT name, s1.rID, s2.mID, s2.stars, s2.ratingDate
    FROM Reviewer s1
    INNER JOIN Rating s2 ON s1.rID = s2.rID
) s3
GROUP BY s3.rID
HAVING COUNT(s3.stars)

 

We hope that in a month’s time we will have more exciting examples, but for now we really must get back to work!

Fraser and Sabine
