Bookshelf: Machine Learning and Data Science

With fewer options for our spare time in the current situation, we can invest some of the extra time we have gained in reading instead of watching (too much) Netflix and other streaming services. Here are some recommendations from my bookshelf.

Machine learning and related topics like data science are the hot topics in IT in almost any business domain, and a basic understanding is important for everyone dealing with software, including sales reps or product managers with little or no computer science background who last touched discrete mathematics, linear algebra or probability in high school. The challenge is that both topics have a steep learning curve and it is hard to acquire decent knowledge. For most of us, though, a basic understanding is good enough to navigate discussions and appreciate what is possible and what is not (yet). There is a huge choice of online courses and books, but the majority either dive deep or cover one very specific aspect of machine learning. Publishers like Springer and Packt even let you download selected titles for free, some for a limited time and some permanently.

Here are two titles that only scratch the surface but give you some insight and an overview of commonly used terms. Both are available as paperbacks or for the Kindle for less than 15 euros.

Machine Learning For Absolute Beginners

by Oliver Theobald, Independently published, 2018, 155 pages

If you start from ground zero with no prior knowledge, this is a good overview from 10,000 feet without diving too deep into formulas and algorithms. You learn about linear and logistic regression, decision trees, clustering and more, but just enough to get a basic understanding. Neural networks are covered too, though it is challenging to convey that topic in only 10 pages. There is a bit of Python coding sprinkled in if you feel like getting more hands-on.

Data Science

by John D. Kelleher and Brendan Tierney, The MIT Press Essential Knowledge, 2018, 280 pages

This book covers the basics of data science, ranging from the definition of data types and the DIKW pyramid to standard tasks, and outlines the complete data science workflow along the CRISP-DM process. Non-technical aspects like ethics and privacy are covered too. The book is 99% free of algorithms and source code.

The MIT Press Essential Knowledge series offers a few other condensed titles, like Deep Learning, Metadata, Computational Thinking and others.

I will recommend some more titles soon. Stay safe and stay tuned.

The Forgotten Sourcecode

I remember the first time I heard the terms Public Domain software and Shareware, somewhere between the late 1980s and the early 1990s. It was towards the end of the Commodore C64 era, when software was almost solely commercial (and not affordable for the average secondary school student), which created a vivid software “sharing platform” in the schoolyard, a solid first release of software piracy. Then I got my hands on my first IBM-compatible PC running DOS. Soon after, data CD-ROMs with Shareware appeared, a legal way to use software. Magazines were published with CD-ROMs attached, and I remember regular visits to shareware shops selling nothing but legal CD-ROMs, years before the internet was available to the public. Public Domain software was totally free of any license, while Shareware was more of a free-to-use model (sometimes under certain conditions or restrictions, similar to today’s lite/free versions). Together they laid the foundation of what we know as Open Source today, in my opinion one of the most important elements of our software landscape. I recommend “The Cathedral and The Bazaar” by Eric S. Raymond, a 1999 book describing the inner workings of open source; a lot of it still applies.

The Cathedral and The Bazaar – 1999 Book by Eric S. Raymond

If you are keen on some time travel, you can download ISO images of a couple of these shareware CDs from archive.org and have a hands-on session with 25-year-old software, though I doubt all of it will run on current hardware and operating systems.

Shareware CD anno 1992 (archive.org)

Once the internet was in place, platforms emerged where hobbyists could host the source repositories and releases of their software. One of the early ones I remember is SourceForge, which was launched in 1999 and still exists today (after changing ownership three times in the last few years). A few others came and went in the same space (BerliOS, Launchpad, java.net, Javaforge, Tigris.org, ...). SourceForge is no longer as dominant, because there are a number of alternatives today, GitHub being one of the most prominent. It still hosts a huge amount of software, though, some of it quite prominent, and it was the starting platform for some well-known solutions (Pentaho, Firebird, Wireshark, Nagios, Notepad++, ...). Over the years I created accounts on several of these platforms and even forgot about some of them; now I solely use GitHub. Recently I came across a simple tool that I created in 2008 to experiment with repositories in Java, and noticed that it is still there and has been downloaded over 2,300 times in the last nine years. Not that the tool does anything more magical than creating UUIDs and copying them to the clipboard. It is just amazing to see that, as long as the platform does not disappear, the code lives on, no expiry attached.
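
For the curious, a tool like this needs very little code. The following is a minimal Java sketch, not the original 2008 source: the class name UuidTool and its two methods are made up for illustration, and it assumes that a random (version 4) UUID is good enough. The clipboard step is skipped on headless systems, where no desktop clipboard is available.

```java
import java.awt.GraphicsEnvironment;
import java.awt.Toolkit;
import java.awt.datatransfer.StringSelection;
import java.util.UUID;

// Hypothetical sketch of a tiny UUID tool; not the original 2008 code.
final class UuidTool {

    private UuidTool() {
    }

    // Creates a random (version 4) UUID in its canonical 36-character text form.
    static String newUuid() {
        return UUID.randomUUID().toString();
    }

    // Copies the text to the system clipboard. Returns false when running
    // without a desktop environment (headless), in which case nothing is copied.
    static boolean copyToClipboard(String text) {
        if (GraphicsEnvironment.isHeadless()) {
            return false;
        }
        StringSelection selection = new StringSelection(text);
        Toolkit.getDefaultToolkit().getSystemClipboard().setContents(selection, selection);
        return true;
    }

    public static void main(String[] args) {
        String uuid = newUuid();
        boolean copied = copyToClipboard(uuid);
        System.out.println(uuid + (copied ? " (copied to clipboard)" : ""));
    }
}
```

Running the main method prints one fresh UUID per call; everything else is plain JDK, which is probably part of the reason such a small tool keeps working for years without maintenance.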

Download statistics, all files (SourceForge, 20 November 2017)

Do you have some old forgotten software treasures too?

UUID Generator download page (SourceForge.net, 20 November 2017)

MOOC – E-Learning on Steroids

Hardly any industry is moving as fast as the IT industry. While your operational experience and knowledge of the vertical domain you work in grow naturally over your career, the same is not true for IT. Take the airport environment: the underlying basics and physics of handling aircraft, planning flights, etc. are exposed to changes, innovations and challenges too, but at a different pace. Look at the A-CDM program, which took quite some years to take off and become mainstream; that is much slower than the rise of any new general IT technology or platform. This industry is picking up speed too, and the boundaries between the digital and the physical world are starting to blur: airports are running digital transformation programs, though passengers still fly in the physical world.

But on the IT side of things, the speed is beyond breathtaking, and it is hard to keep even a minimal overview of the many areas of IT as well as dive into specific topics. How do you stay up to date and tune into relevant topics? Books (ink and electronic versions) and forums are certainly the traditional approach; on top of that, you can join conferences and in-person seminars and training (which cost money and time).

In the 2000s online courses came into the picture as the successor to e-learning, allowing a much bigger audience to learn new technologies, skills and more. The very positive part: there are lots and lots of free courses. Most platforms offer both free and commercial courses; some are free to participate in and only charge a fee if you want an official certificate (one can argue about the value of such certs). Most importantly, you can learn, move forward and update your knowledge with the click of a button.

The big challenge, though, is to identify what you need or are interested in, find the right courses and, most importantly, manage your time. Since you are using your spare time, you have to choose wisely. You can’t run after every course out there, even if many of them interest you and you are tempted to sign up for a dozen courses, only to finish none of them.

Today’s keyword (or buzzword) for this is MOOC, short for Massive Open Online Course. This is like e-learning on steroids: in the past you had to look at dull corporate slides pretty much by yourself; now there are videos, reading material online and offline, and interaction with the organizer, mentor, trainer or your virtual peers at various levels.

Not only the organizations that started online learning, like schools and universities, are in the game, but also companies operating dedicated online course platforms, book publishers offering courses and, finally, professional social platforms like LinkedIn.

I attended online courses at Coursera and Udacity, which offer a broad range of topics, and have now started some specific courses on HCI and UX at the Interaction Design Foundation, which solely offers courses on UX, HCI, visualization and related topics. The courses are unattended (except for the rating of your text answers or comments) and run repeatedly, yet you still have a motivation to participate and go through the lessons, because you pay money and because the lesson packages are released over time, which helps you pace the whole course.

Stay tuned for the results.

d3.js – Available Books

D3 is my favourite visualization platform, though the learning curve is on the steep side because it is all about selections, data mapping and transformation close to the DOM. D3 does not come with predefined visualizations like bar and pie charts. The website offers lots of samples, and tutorials are available as well. If you take the time to walk through them and experiment by yourself, you will learn the most. Still, I enjoy reading books about technical topics with an end-to-end walk-through.

Currently there are two books about D3, both from O’Reilly and both with a similar introductory focus.

Getting Started with D3

June 2012, 12.99 US$ (ebook)

The book does what its title promises: it gets you started. It jumps right into D3 with sample applications and code. What I really like is that the author connects the visualizations to real-life data (New York’s MTA transportation data), which makes the whole book more entertaining and tangible. It also provides a chapter about transitions and interaction, and even one about layouts, which make for more exciting visualizations like those we all know from the samples page of the D3 website. It does not go into advanced details, though. At this reasonable price I would recommend the title.

Interactive Data Visualization for the Web

November 2012, 23.99 US$ (ebook)

This book is a bit more comprehensive than the first one. It starts with the underlying technologies and gives the reader an introduction to HTML, the DOM, CSS and JavaScript. The chapters covering D3 are longer and provide slightly more detail. The running example revolves around bar charts and scatter plots, which turns dull after a while. The early release I have seems to be incomplete, so I don’t want to give a final verdict.

With D3 obviously getting more popular, we will certainly see more books, hopefully covering advanced features and more visualization-centric topics. I was asked if I would like to write one, but my D3 knowledge is not nearly comprehensive enough. I wish Mike Bostock would write one.

Post number 300! Thanks to the up to 1,000 visitors a day.

My Review of Java Message Service

Originally submitted at O’Reilly

Java Message Service, Second Edition, is a thorough introduction to the standard API that supports “messaging” — the software-to-software exchange of crucial data among network computers. With this practical guide, you’ll learn how JMS can help you solve many architectura…

Old Topic, but a good starting point.

By AnotherJavaDude from Singapore on 10/29/2010
4 out of 5

Pros: Well-written, Helpful examples, Accurate

Best Uses: Intermediate

Describe Yourself: Maker, Developer

Despite JMS being an “old” (in IT terms) piece of technology (API), you still find requirements for it in lots of projects, and it fully satisfies the basic needs of a messaging API, even though it has not been updated since 2002.
This is one of the few books covering the topic at all, and with its code samples it is the best you can get to get started with JMS. With a fair knowledge of your application/messaging server and your favourite IDE, you should be able to get the samples running.

New Book on GlassFish Security available

Getting more involved in the security requirements of real-life production setups running GlassFish, I searched the web for this topic. So far there has been no book focusing on security concerns; this changed a few days ago.
I just got my hands on the new book by Masoud Kalali, GlassFish Security, from Packt Publishing. Find more info here. I have just started reviewing it and will update you soon.