Showing posts with label Information Theory. Show all posts

Saturday, July 27, 2013

Applied Networking Labs, Pearson Custom Business Resources, 1st Edition, Randy J Boyle


I purchased this book as the hands-on component for Dr. Boyle's Networking and Servers course at the University of Utah. The labs are excellent, as they teach you basic knowledge of dozens of programs that are commonly used by network administrators and professionals. After doing most of the labs I have added substantial skills to my resume that have greatly increased my visibility to employers. I applied to five network administrator positions and each of them gave me a call, even though I have no work experience. My resume was impressive enough with the skills that these labs added. That is why this book is so valuable: it doesn't teach theory but applicable skills that employers want. I suggest this book to anyone who is interested in learning networking and servers from a basic level. The only downside is that because so many programs are covered, there isn't a full in-depth discussion of how to use each one. That being said, the book isn't designed to be a user manual for each program but to give general knowledge and skills, thereby allowing users to discover and learn on their own with the programs they are interested in.

Dr. Boyle's textbooks have always been a great resource for learning hands-on applications of IT security and networking skills. Knowledge learned from this book can be put on a resume and can teach you how to talk the talk during interviews. Unfortunately, many think that doing just some of the exercises in the book justifies putting those skills on their resume. This book will give you a great introduction, but you have to apply yourself with more outside learning to become proficient enough to get through a job interview!

This book is fantastic. The projects in it are explained thoroughly, and the content is really interesting. I enjoyed doing the projects a lot.

One of my favorite projects involved creating a bootable Linux distro and running the OS from a flash drive as the computer boots.

Product Details :
  • Paperback: 336 pages
  • Publisher: Prentice Hall; 1 edition (July 24, 2010)
  • Language: English
  • ISBN-10: 0132310341
  • ISBN-13: 978-0132310345
  • Product Dimensions: 0.5 x 8 x 11 inches


Sunday, May 5, 2013

Head First Data Analysis: A Learner's Guide to Big Numbers, Statistics, and Good Decisions 1st edition, Michael Milton



This was a beautiful book that really refueled my interest in statistics (which I've been struggling to start learning...even though I know calculus and LOVE mathematics)...but it really caught my eye because it goes into detail about the R statistical programming language.

The first few chapters get you going on a specific mindset of how to interpret data, which is VERY important to keep throughout the entire reading of this book.

After that groundwork is established, you are taken on a really cool journey through some Excel features (don't freak out...those of you who aren't proficient in Excel will be fine in the hands of this book) that you never would've believed were there! You can even use Google Docs to do the same things if you don't have a valid copy of Excel!

Finally, R comes into play with all its glory...I would've loved for a deeper dive with this technology, but there are several other books out there in which you can get down and dirty with R (http://www.amazon.com/The-Art-Programming-Statistical-Software/dp/1593273843/ and http://www.amazon.com/Cookbook-OReilly-Cookbooks-Paul-Teetor/dp/0596809158/ are my favorites and I own them both on my kindle).

I hope that eliminates all your FUDs (Fears, Uncertainties, and Doubts)...go and grab this book RIGHT NOW! You'll be blown away by what you'll be able to do after you read everything here!

P.S. It only takes about a week and a half to get through it going at a nice, slow, and comfortable pace...if you're HUNGRY like I was, you can knock it out in about 4 days.

I got the book promptly. It has that softbound textbook feel, but with good binding that is not cheap or ready to fall apart. The information in it so far seems interesting and well organized. It's in the "Head First" format, which means there is a lot of nice visual layout, side notes, and graphics that make the concepts easier to understand by letting you see them when possible. I like that format. It is still pretty clean and gets to the point. But I have only read and used so much of it at this point, so I cannot go much further into the content than that.

In short, I think it is a solid book to get if one wants to better understand how to interpret social science numbers, or other scientific numbers, in a way that makes one wise to the various ways data can be spiked and spiced. How in depth it goes I cannot comment on, as I have not fully digested the book. But it is a book that is designed to be both read and used as a topical reference. And it has the "Head First" style, keeping things clean but providing insightful commentary, context, and graphical illustrations where they might really speed up or enhance understanding of a particular idea or complicated example. It also uses bolding to draw attention to new vocabulary you might want to learn in order to lay the groundwork for even more technical education in data analysis. It even has a chapter that goes over some of the more obscure plug-ins for Excel that help a person analyze data. I would basically treat this book as a nice survey of both the human and technical sides of data analysis. It also covers things like data collection and effective data presentation, and, as I said, it refers to several readily available tools, Excel for example, and how they can be used by someone who wants to leverage their computer to tame and extract meaning from data they have been given to interpret.

I think that it's a useful primer that is like a survey course in the subject sans the professor. But how good each section is I cannot comment on, as I have only been using the book for several weeks. What I did read I found completely intelligible, and because I am not a total novice at looking at data, there were times I could use its nice formatting to skip past explanations I did not need because I was already familiar with them. If I fall in love with the book I may come back and say so and make my stars 5 instead of 4, but at this point I would highly recommend this book to anyone who wants a nice primer that goes into a very serviceable level of detail for a primer or survey-type information source.

This book is for professionals that must analyze data in their daily work. First off, if you are unfamiliar with the approach of the "Head First" series of books by O'Reilly, the approach was and is revolutionary in the field of technical writing. The authors of this series know that page after page of terse text will not easily penetrate the brain of the working professional who needs help rather quickly. Traditional textbook models work best on students in a traditional classroom setting who can slowly absorb material over a period of several months with the help of bi-weekly classroom sessions with a professor. The working professional does not have this luxury of time or of personal tutoring.

Thus the authors both penetrate your brain and hold your interest by serving information up in unusual ways: odd pictures and illustrations, Q&A sessions, repeating the same material in different ways, and interesting case studies in which you are asked at every step to give your input. They'll even lead you down the wrong path every now and then so that you remember the right one all the better.

As for the subject matter, this is not a book on statistics and how to solve problems in statistics. Instead, it is about how you use various statistical models, tools, and visualization to analyze often confusing corporate data and come up with recommendations based on that data. Some mathematical methods are presented as they are necessary to solving the underlying problems: optimization, hypothesis testing, Bayesian statistics, subjective probabilities, heuristics, and histograms are all mentioned and even have their own chapters. However, this book is also about tools, specifically R and the analysis tools of Excel. In the appendix, this book even shows you how to install R.

However, I don't believe that you could get away with knowing nothing of statistics and really get the most out of this book. If you do happen to have the luxury of a little time I suggest the following. Read the excellent Head First Statistics as a tutorial, and then use the problems in Schaum's Outline of Statistics (Schaum's Outline Series) to test your knowledge. Then you should be more than ready for this book.

The author has a chapter entitled "leftovers" that tells you what this book does not cover. I include that here so that you don't waste your time if this is what you are looking for:

1 Everything else in statistics
2 Excel skills - (book assumes previous experience)
3 Edward Tufte and his principles of visualization
4 PivotTables
5 Nonlinear and multiple regression
7 Null-alternative hypothesis testing
8 Randomness
9 Google Docs

I highly recommend this book for the right audience with the right experience level.

First, a disclaimer: as one of the technical reviewers for the book, I might be a little biased. Having said that, I'm willing to bet my copy of Head First Data Analysis that this won't be the last 5-star review you'll find here :-)

By my count this is the 20th book in the Head First series, so by now most Amazon customers know the story behind the Head First format, style, and pedagogy. These aren't your typical technical books, so if this is the first Head First book you're considering, you owe it to yourself to get a sneak preview first. I think you'll be in for a treat.

The Amazon Reader does have the first six pages of Chapter 1, which will give you some idea, but I'd recommend going to Head First Labs where you can download and read the entire 2nd chapter. You can also grab the full Table Of Contents in PDF format, which I believe is a little easier on the eyes than the TOC in the Amazon Reader.

The book is written for folks without hardcore data analysis experience who are looking for an introduction to analyzing data to make better decisions. You won't need a background in statistics, engineering, or computer science. While some data analysis books assume you're a math geek, Michael Milton does not.

And while many "Data Analysis" books pretty much revolve around Excel's data analysis functions (Analysis ToolPak, Solver, etc), this book is more about how you work with data, not about how you use a particular software tool. While you do use spreadsheets and a statistical computing software package called "R", the focus is on using the tools between your ears to become a better data analyst.

These days almost everyone needs to deal with and interpret data. Those that become successful know how to make sense of it all. This book will help you think about, process, and present your data so you can draw reliable conclusions to real-life questions.

Product Details :
Paperback: 486 pages
Publisher: O'Reilly Media; 1 Original edition (August 4, 2009)
Language: English
ISBN-10: 0596153937
ISBN-13: 978-0596153939
Product Dimensions: 8 x 1 x 9.2 inches


Thursday, April 25, 2013

MapReduce Design Patterns: Building Effective Algorithms and Analytics for Hadoop and Other Systems 1st edition, Donald Miner



In the 1990s O'Reilly books had a well-earned reputation for quality. O'Reilly authors such as Simson Garfinkel explained technical topics with precision, clarity, and wit. I proudly kept a whole shelf of O'Reilly books at work, and I imbibed copious java from their tenth anniversary mug. I'm sorry to see that O'Reilly's traditional quality has gone the way of the Internet bubble. MapReduce Design Patterns represents the absolute nadir of technical writing, and it never should have been published in its current form.

One of the most poorly written parts of the book is Appendix A on Bloom filters. As I was writing my original review of the book, I thought it might be helpful to point readers to a better explanation of the topic. Turning to Wikipedia as a potential reference, I was struck by the number of similarities between it and Appendix A. It now appears that this appendix plagiarizes the Wikipedia article "Bloom filter." To see this, compare the opening paragraph of the Wikipedia article (January 19, 2013) to the first two paragraphs of the book's appendix (which you can see in the sample pages here):

Wiki: A Bloom filter, conceived by Burton Howard Bloom in 1970, is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set. (Paragraph 1, sentence 1)

MRDP: Conceived by Burton Howard Bloom in 1970, a Bloom filter is a probabilistic data structure used to test whether a member is an element of a set. (Page 221, paragraph 1, sentence 1)

Wiki: False positive retrieval results are possible, but false negatives are not; i.e. a query returns either "inside set (may be wrong)" or "definitely not in set". (Paragraph 1, sentence 2)

MRDP: While false positives are possible, false negatives are not. This means the result of each test is either a definitive "no" or "maybe." You will never get a definitive "yes." (Page 221, paragraph 2, sentences 2 - 4)

Wiki: Elements can be added to the set, but not removed (though this can be addressed with a counting filter). (Paragraph 1, sentence 3)

MRDP: With a traditional Bloom filter, elements can be added to the set, but not removed. There are a number of Bloom filter implementations that address this limitation, such as a Counting Bloom Filter, but they typically require more memory. (Page 221, paragraph 2, sentences 5 and 6)

Wiki: The more elements that are added to the set, the larger the probability of false positives. (Paragraph 1, sentence 4)

MRDP: As more elements are added to the set, the probability of false positives increases. (Paragraph 2, sentence 7)
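
For readers unfamiliar with the data structure both passages describe, a minimal Python sketch may help. It is not taken from the book or from Wikipedia; the class name, the bit-array size, the number of hashes, and the salted SHA-256 hashing are all illustrative assumptions rather than anyone's reference implementation.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter sketch; sizes and hashing scheme are assumptions."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One byte per bit keeps the sketch readable at the cost of memory.
        self.bits = bytearray(num_bits)

    def _positions(self, item):
        # Derive num_hashes positions by salting SHA-256 with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Elements can be added but not removed: bits are only ever set.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means "definitely not in set"; True means only "possibly in set".
        return all(self.bits[pos] for pos in self._positions(item))


bf = BloomFilter()
for word in ("hadoop", "pig", "hive"):
    bf.add(word)

print(bf.might_contain("hadoop"))  # an added item is never reported missing
```

Because `add` only ever sets bits, a lookup for an added item cannot fail, which is why false negatives are impossible. An item that was never added can still find all of its positions set by other items, and that becomes more likely as the array fills up.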

When confronted with examples like these, authors typically claim that the similarities are due to their unintentionally copying verbatim from their notes. While that may be true in some cases, it is the task of the publisher to see that problems like this are corrected before books are released. Clearly the authors and the editors at O'Reilly have failed to diagnose this problem and provide a timely appendectomy. The result is a book with a fatal case of appendicitis left to die a humiliating death in the marketplace.

Although MapReduce Design Patterns would have benefitted from an appendectomy, such an operation would have been insufficient to restore the book to good health. For much of the book suffers from a sort of write-once-copy-everywhere mentality that leads to dreadful writing and programming. A few choice examples should suffice to illustrate this point.

Until the book's penultimate chapter every example except two includes this pattern of statements:

"The following descriptions of each code section explain the solution to the problem.
Problem: ..."

Apparently it occurred to neither the authors nor the editors that it might be premature to refer to "the problem" and its solution before that problem had been stated. And certainly no one thought to ask whether or not the first sentence of the pattern clearly sets forth what's coming next in the book. Yet through the magic of the Ctrl-C, Ctrl-V sequence, this statement appears dozens of times throughout the book.

The first hint of an editorial hand finally appears at the beginning of the Generating Data Examples section of Chapter 7, where at last we find the statement of a problem in paragraph form followed by our now familiar sentence. Unfortunately, the book's remaining four examples revert to the authors' text design pattern with an ungrammatical twist:

"The sections below with its corresponding code explain the following problem.
Problem: ..."

Perhaps a NullWritable object would have made a better editor.

Fortunately, not all of the book's wretched writing is as annoying as this. Some of it, such as this garbled thought from page 185, is hilarious:

"There is no implementation for any of the overridden methods, or for methods requiring return values return basic values."

Programmers may be amused by how the class MRDPUtils seems to appear and disappear randomly with the invocation of the method transformXmlToMap() in the book's code examples. They may also laugh at the erroneous comments in the source code on pages 20, 23, 26, and 29. Since the book's sample code contains the same errors, one might begin to wonder if anyone read or tested that code after it was written. Considering the map() method of the UserIdReputationEnrichmentMapper class given on page 165, that seems unlikely. An astute reader will easily see that this method emits the wrong key, and testing certainly would have revealed it. Since the map() method's actual output clearly contradicts the specification for the reducer implementation on the same page, the problem could have been spotted by a conscientious editor.

Almost two decades have passed since Simson Garfinkel typed "buy more O'Reilly books" in an example in one of his books. After reading MapReduce Design Patterns, I no longer agree with his recommendation. Readers who are interested in this topic will do well to look elsewhere for more information on the subject.

This book is a good catalog of the different patterns any big data solutions programmer should know in order to do their job effectively. While the authors admit that writing some of these patterns as a map/reduce job on Hadoop can be counterproductive when tools like Pig are available, they make the compelling argument that understanding these patterns is still important.

The technical examples in the book are sometimes missing blocks of code, which, while easily derived, may be a source of frustration for some readers. (I have my implementations of the exercises on GitHub, under my username of cfeduke; I learn best by doing, so keying in and executing examples is paramount.)

I had a moderate level of experience with Hadoop, from 0.18 to 1.x, before tackling this book. I felt that this book taught me a fair amount about the guts of writing a map/reduce job, though without a solid foundation in Hadoop the examples might have been difficult to grok.

The authors chose to use Stack Overflow community data to demonstrate the patterns presented, and I felt that was an excellent decision, as it's easy to derive other queries to answer, and implement, when you have some knowledge of the corpus.

The book gives a good introduction to MapReduce design patterns. But what I found really missing are good examples.
I had studied Jimmy Lin's book [...] before I read this, and it gives some really good examples of algorithm design. I was hoping to find something that focused on how some of the design patterns can be leveraged to implement more complicated, non-trivial algorithms in MapReduce more effectively.
But I feel that the book uses some fairly straightforward algorithms to explain the patterns and does not go deep.
Another thing that I did not like is that the book is too Hadoop-specific and ignores other MapReduce implementations that are getting very popular.
Overall the book is a good step toward introducing patterns and algorithms in a more systematic manner in the MapReduce programming paradigm. It gives a good survey of some of the emerging areas in the last few chapters. The chapter on Meta Patterns was my favorite, as it gives some good introductory material on building more complicated pipelines using MapReduce and on how one could take steps to optimize the runtime of bigger pipelines.

Product Details :
Paperback: 230 pages
Publisher: O'Reilly Media; 1 edition (December 22, 2012)
Language: English
ISBN-10: 1449327176
ISBN-13: 978-1449327170
Product Dimensions: 7.5 x 0.6 x 9.2 inches
