Rough Draft Done!

I just finished the rough draft of my book. All 11 chapters. That's right, eleven chapters (not the planned ten). After ten months of work, I finally have a rough draft done. Ten months is also a long time in software terms, and Tensorflow has changed a bit. This means I will have to go back and update the chapters, as some functionality has changed or been deprecated. But that's what you get for writing a book about software that is still technically in beta.

I must say, some chapters were much harder than others, and I should have planned for this in my timelines. Because of this, I fell behind schedule around chapter 8. Chapters 8 and 9 were technically the hardest and required much of my time poring over documentation and debugging.

For a few examples in the book, I had to accept that the official documentation (and other tutorials) had better and more in-depth explanations. But the publishing company really wanted me to cover those examples, so I had to decide how to handle this. Do I just repeat what is already out there? Do I try to cover it in more depth? Do I skirt the issue and reference the better work? After some debate, I came to the following conclusion.

I will not reiterate official tutorials and documentation where it is explained better. I don't want someone to pay for a book that has information in it that is free elsewhere.

There was only one section where I reiterated code from the official software tutorials. I decided to do this because the official tutorial was really lacking in code explanations and felt kind of hand-wavy. Because of this, I referenced the official tutorial and told readers that we would instead explore it in much more depth. For reference, this is the 'deep-dream' tutorial in chapter 8.

For another section, I decided to concentrate on preparing a different type of dataset for use with the official tutorials. I felt this was a great compromise because even though the official tutorials used a canned dataset, their methodology was sound.

I have been updating the github repository here: https://github.com/nfmcclure/tensorflow_cookbook. You can find the python scripts there, and about 50% of them have accompanying documentation. Over the next month, I will be adding more documentation as well as accompanying Jupyter notebooks. Also, I'm told the editing of the book starts soon.

Overall, my journey of writing a book (so far) has been a huge learning experience and an even bigger time drain. I have a lot more respect for authors now. I'm not sure I will be so eager to write another book.

Soon, I can concentrate on being social, active, and posting about something other than this time consuming book.

Posted in Uncategorized | 7 Comments

I Started Writing a Technical Book... (part 3)

I'm over halfway done with the book I've been writing. I just submitted chapter 5. I've written all the code for chapter 6 and am currently writing up the explanations and walk-throughs now for that chapter. A few weeks ago, I got permission to put up the code for the book on my github here:

https://github.com/nfmcclure/tensorflow_cookbook

It seemed kind of flat to just put up the python scripts for each recipe, so I'm trying to add documentation with graphs to explain what each chapter and section is about and why. I don't want to undercut the book by providing all the information on github, but I do want to provide enough that the scripts are readable and clearly serve a purpose in understanding Tensorflow and the machine learning algorithms.

Also, a few weeks ago, my editor suggested a cover picture for the book. I noticed that this publishing company's style for technical book covers is just a really nice random picture. Most technical book covers don't really relate to the content; in fact, O'Reilly is famous for putting random pencil drawings of animals on its technical books. Here is the cover for my book:

(Image: mockup of the book cover)

Also, as a really interesting side note, my publisher offered to enroll the book in a new program they offer. The book is available NOW, for full price, and you can get updates as I write it. If you are interested in that deal, the link is here:

https://www.packtpub.com/big-data-and-business-intelligence/tensorflow-machine-learning-cookbook

The writing has been going well. Writing the chapter on support vector machines was much harder than I expected. While I studied SVMs in school, actually writing the code and figuring out how to implement different kernels via matrix multiplication in Tensorflow was quite frustrating, but very rewarding when it finally worked.
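The matrix-multiplication trick I mean can be shown outside Tensorflow just as well. Here is a sketch in NumPy of a Gaussian (RBF) kernel matrix built from a single matrix product (the function name and `gamma` default are mine, not from the book):

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    # Pairwise squared distances via the expansion
    # ||xi - xj||^2 = ||xi||^2 + ||xj||^2 - 2 * xi . xj,
    # which needs just one matrix multiplication instead of nested loops.
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))

X = np.array([[0.0, 0.0], [1.0, 0.0]])
K = rbf_kernel(X)
# K[0, 0] is 1.0 and K[0, 1] is exp(-1.0)
```

The same broadcasting pattern translates almost line for line into Tensorflow ops, which is what makes the kernel SVM recipes workable there.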

I'm slightly worried about the rest of the book, as it is on a fairly new subject to me and I'm currently only one chapter ahead of the deadlines.

Back to writing...

Posted in Uncategorized | 6 Comments

I Started Writing a Technical Book... (part 2)

A lot has happened since I first decided to write a book. I quit my job, accepted a new job, and am currently taking about 6 weeks off to get a jump start on writing the hardest parts of the book... the first few chapters.

The job change was partly due to the fact that I wanted time off for the writing and partly because I've had an itch to work for a specific company in Seattle for a while and they had a job opening available that fit very well.

So I went in for an interview, and it turned out I really liked the people and the problems they're trying to solve. I liked my prior job, but this new company is tackling an issue that I'm very passionate about.

As for the book, it's taking a bit longer than expected. I thought I could finish about a chapter a week during this break, but in the first week I had a hard time adjusting to my unscheduled time. I can now run errands whenever I feel like it, take breaks for as long as I want, and get easily distracted by whatever social media I happen across. I now have huge respect for people who consistently work from home.

I've started disciplining myself with my allocation of time. It's easy to do in an office away from home, but much harder when I'm sitting in my living room, typing on my laptop with my bed, dishwasher, and washing machine just feet away, begging for naps and chores to be done.

I will finish the first chapter by the end of this week and hopefully finish the second chapter by the end of the month. Wish me luck.

Posted in Uncategorized | Leave a comment

I Started Writing a Technical Book... (part 1)

I decided to write a book.

Yikes. Just writing that now seems scary. You are making something pretty permanent that will forever reflect how people view you. Anyone, anywhere (hopefully) will be able to pick up your book and instantly form an opinion of you as a person. Yikes.

For some background, I tend to jump headfirst into projects. I'm not one to say no to much. Currently, I work full time as a 'Data Scientist' and teach a night class on 'Programming Statistical Methods' for a non-credit certification course through a local university.

I like challenges. This is probably why I like exercising, programming, and learning as much as I can about statistics and machine learning.

So last month, when a publishing company approached me and said, “Do you want to write a book about topic X?”, I gave it some serious thought. The topic was a machine learning library that Google released last November.

When Google released this library, called Tensorflow, it made big waves through the machine learning community. I, like many others, went through the tutorials and tried it out for a bit. It seemed very promising. Then life took over and I haven't had much free time to use it since.

At the beginning of this year, I made a goal: I wanted to create a series of blog posts on Tensorflow to convince people how great it is, which would also force me to get real hands-on experience with it. Another reason is that a significant subpopulation of the machine learning community immediately dismissed Tensorflow because it was slightly slower than the alternatives. While I know that speed and efficiency are very important, advances in computation and hardware are accelerating quickly. What can be done now in 6 hours will soon be done in 6 minutes, so Tensorflow isn't that far behind.

BUT, and this is a BIG BUT, Tensorflow has achieved portability. Any device that can run C/Python code can run Tensorflow with barely any code changes. This is momentous. To give you some perspective, your smartphone can probably run C and Python code. So if a company creates an amazing machine learning program, it can deploy it to your phone with very little effort. No other machine learning platform can currently do that. In fact, to deploy some of the current deep learning software, you need access to assorted compilers and programming languages that don't come standard on any machine.

The great thing about blog posts is that you can go at your own speed, and no one really has to notice if they don't want to. I was reading more and gearing myself up for this when the publishing company sent me an email asking me to write a shorter technical book on Tensorflow.

The first thing I did was research what people said about the publisher and about the experience of writing a technical book. The overwhelming opinion was that it is not worth the trouble financially or time-wise. Well, I wasn't looking for a monetary payday with the book, but the time issue troubled me. I already have many personal projects I've wanted to work on and haven't made time for.

The deciding factor was that I will never have time. If I don't have someone (like a publisher/editor) to hold me accountable, the blog posts on the topic will take a while and not help as many people. I figured I could accomplish two things with this book: (1) help people learn about this technical topic, and (2) learn a lot myself in the process.

Last week, I agreed over email to start, with the caveat that we extend the timetable for publishing (since I have a lot to learn beforehand as well). Then the publisher sent me a schedule that includes sending them a rough table of contents by the end of the month. Today was the first chance I had to sit down for multiple hours, do some research, and start to flesh out the outline of the book.

I'm glad I started early; somehow, in my head, I thought the outline/table of contents would be the easy part. Only by starting did I realize how tricky it can be. Should topic X be a chapter or a subchapter? What is the logical flow of the chapters? Do some chapters depend on others? After a few hours' work, I've only got a rough outline for the first two chapters and potential topics for the remainder of the book. I have a feeling this is going to be a rough year.

Posted in Uncategorized | Leave a comment

Trump + The Scream = ...

I was inspired by the recent republican debate to use neural networks to merge a picture of Donald Trump with the style from "The Scream" by Edvard Munch. I modified a tensorflow neural style algorithm to output the results continuously and made every 10 generations into an animated GIF:
(Animated GIF: Trump merged with the style of "The Scream")

See my github for a fork of the neural style algorithm in Tensorflow, modified to output images during the iterations:

Neural style in Tensorflow
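For the curious, the frames-to-GIF step can be sketched in a few lines. This is a minimal stand-in using Pillow; the solid-color frames and the filename here are placeholders for the images the modified script writes out every 10 iterations:

```python
from PIL import Image

# Placeholder frames; in practice, load the saved iteration images instead,
# e.g. frames = [Image.open(p) for p in sorted(glob.glob("output/*.png"))]
frames = [Image.new("RGB", (64, 64), (i * 25, 0, 0)) for i in range(10)]

# Write an animated GIF: 200 ms per frame, looping forever
frames[0].save("style_progress.gif", save_all=True,
               append_images=frames[1:], duration=200, loop=0)
```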

Also, please keep comments non-political. Let's keep this to how far machine learning / deep learning has come.

Posted in deep learning, Visualization | Tagged , , | Leave a comment

ImageNet CNN Architecture Image

I'm getting really tired of this classic CNN ImageNet paper architecture image. Here is the original article from 2012: Imagenet Classification with Deep Convolutional Neural Networks. You know the one, where the top 25% of the image is cut off? I decided enough was enough and I wanted to fix it. How hard can that be? I loaded up Inkscape and finished the boxes. Yes, it's not perfect, but it'll work. See below.
(Image: the fixed CNN architecture diagram)

Posted in deep learning | Tagged , , , | Leave a comment

Displaying Digits of Pi on Raspberry Pi in Python

(Sorry for the long hiatus of posts, I've been dealing with some health issues that appear to be going away finally.)

NOW.....

One of the most ePIc Pi-days is here today!  3-14-15!!!   What better way to celebrate this Pi-Day than to program your Raspberry Pi in Python to display digits of Pi?

We are going to accomplish this with 4 LEDs, which together will display the binary representation of each decimal digit of Pi.

Here is the key for the 4 LEDs to display digits:  (1 for on, 0 for off)

(led 1, led 2, led 3, led 4)

0 = (0, 0, 0, 0)

1 = (1, 0, 0, 0)

2 = (0, 1, 0, 0)

3 = (1, 1, 0, 0)

4 = (0, 0, 1, 0)

5 = (1, 0, 1, 0)

6 = (0, 1, 1, 0)

7 = (1, 1, 1, 0)

8 = (0, 0, 0, 1)

9 = (1, 0, 0, 1)

We want to display in order 3 then 1 then 4 then 1 then 5 ....
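Put differently, the key above is just each digit's 4-bit binary representation, with LED 1 as the least significant bit. A quick sanity check in Python (the helper name is mine):

```python
def led_pattern(digit):
    # LED i is lit when bit i of the digit is set (LED 1 = least significant bit)
    return tuple((digit >> i) & 1 for i in range(4))

print(led_pattern(3))  # (1, 1, 0, 0)
print(led_pattern(9))  # (1, 0, 0, 1)
```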

Here is the photo of the completed project:

(Photo of the completed project)

Here is the wiring setup:

(Wiring diagram)

Here is the Python code:

 


# Script to display the digits of pi in binary on LEDs
import time
import RPi.GPIO as GPIO

GPIO.setmode(GPIO.BOARD)
GPIO.setwarnings(False)

# One pin per LED; led_pins[0] is the least significant bit (LED 1)
led_pins = [7, 11, 13, 15]
for pin in led_pins:
    GPIO.setup(pin, GPIO.OUT)

digits_of_pi = '31415926535897932384'
for digit in digits_of_pi:
    value = int(digit)
    # Light the LEDs matching the set bits of the digit's binary form
    for bit, pin in enumerate(led_pins):
        GPIO.output(pin, GPIO.HIGH if value & (1 << bit) else GPIO.LOW)
    time.sleep(1)
    # Turn everything off before the next digit
    for pin in led_pins:
        GPIO.output(pin, GPIO.LOW)

GPIO.cleanup()

Yay!!!!

HAPPY PI DAY!!!

Posted in Uncategorized | Leave a comment

Random Buffy the Vampire Slayer Episode Generator

Let me first say that I'm a fan of everything Joss Whedon.  (EDIT: Oof, sorry, some things just don't age well. It's really tough to be a fan of something and later find out that the author/producer/artist/actor is a bad person. I have fond memories of watching Buffy, Angel, Dollhouse, and a few other things made by that person, but I won't continue supporting someone who has done bad things.)

So when I was asked last year to create a 'Random Episode Generator' for Buffy, I jumped at the chance.  I also think this is a great way to illustrate the usefulness of Excel.  That's right, I just used 'useful' and 'Excel' in the same sentence.  I think Excel is vastly underrated in the nerdy-data community and is very helpful.  Certain things it does very well, and certain (many) things it does horribly.

But this project is perfect because of its simplicity.  Excel can do this in a heartbeat and look halfway decent as well.  Here is a screenshot of the spreadsheet:

(Screenshot of the spreadsheet)

All you have to do is hit 'F9' (Windows) or 'Command + =' (Mac) and it will pick a random episode and tell you which season, disc, and episode number to find it at.  I've attached it here:

BuffyEpisodeGenerator
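For anyone allergic to Excel, the same idea fits in a few lines of Python. The episode list below is a tiny stand-in (the disc numbers are guesses), not the data from the spreadsheet:

```python
import random

# A minimal Python analogue of the spreadsheet's F9 recalculation:
# pick a random episode and report where to find it.
episodes = [
    {"title": "Welcome to the Hellmouth", "season": 1, "disc": 1, "episode": 1},
    {"title": "The Harvest", "season": 1, "disc": 1, "episode": 2},
    {"title": "Once More, with Feeling", "season": 6, "disc": 2, "episode": 7},
]

pick = random.choice(episodes)
print(f"Watch: {pick['title']} "
      f"(season {pick['season']}, disc {pick['disc']}, episode {pick['episode']})")
```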

Posted in excel, Visualization | Tagged | Leave a comment

Uber Taxi Bar Chart Fail

I thought I would share something I found in an advertisement/marketing email from Uber Taxi.  See anything amiss about the bar charts?

(Image: the Uber bar charts)

Let's hope Uber doesn't charge rates the way they scale their bar charts.  Could be bad for business.

Posted in analysis, data, Visualization | Tagged , , | Leave a comment

The Hobbies of the Scripps 2014 Spelling Bee Contestants

I downloaded text data from  http://public.spellingbee.com/public/results/2014/round_results  and performed some text mining on the contestant interviews, coming up with some interesting results.  The hobby with the highest average round was volunteering, followed by chess and movies.  I filtered for hobbies that occurred for at least 10 contestants and stopped counting at round 13.  The reason I stopped at round 13 was that there was a tie this year, and the top two contestants went many rounds past 13.  If I had kept all the rounds in, the top hobby would have been playing the oboe, as both winners played it.
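The aggregation itself is simple. Here is a toy Python sketch of the same idea, with made-up records and a lowered count threshold to fit the toy data (the actual scripts were in R):

```python
from collections import defaultdict

# Made-up stand-in records of (hobbies, last round reached) --
# not the real Scripps data.
contestants = [
    (["chess", "movies"], 7),
    (["volunteering", "chess"], 12),
    (["movies"], 3),
    (["volunteering"], 13),
]
MIN_COUNT = 2   # the post used 10; lowered here for the toy data
MAX_ROUND = 13  # cap at round 13 to ignore the two finalists' tiebreaker rounds

rounds_by_hobby = defaultdict(list)
for hobbies, last_round in contestants:
    for hobby in hobbies:
        rounds_by_hobby[hobby].append(min(last_round, MAX_ROUND))

# Average round per hobby, keeping only sufficiently common hobbies
averages = {
    hobby: sum(rounds) / len(rounds)
    for hobby, rounds in rounds_by_hobby.items()
    if len(rounds) >= MIN_COUNT
}
print(averages)  # {'chess': 9.5, 'movies': 5.0, 'volunteering': 12.5}
```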

Anyways, here are the results:

(Infographic of the results)

As a side note, I'll post the two R scripts to my git soon.  (One script for web scraping, one for analysis.)

Posted in analysis, data, R | Tagged , , , | Leave a comment