Note for Internet of Things - IOT by SHIVA PRASAD DAS

  • Silicon Institute Of Technology SIT - SIT

Chapter 10: Data Analytics for IoT
Book website: http://www.internet-of-things-book.com
Bahga & Madisetti, © 2015

Outline
  • Overview of Hadoop ecosystem
  • MapReduce architecture
  • MapReduce job execution flow
  • MapReduce schedulers

Hadoop Ecosystem
  • Apache Hadoop is an open-source framework for distributed batch processing of big data.
  • The Hadoop ecosystem includes:
      • Hadoop MapReduce
      • HDFS
      • YARN
      • HBase
      • Zookeeper
      • Pig
      • Hive
      • Mahout
      • Chukwa
      • Cassandra
      • Avro
      • Oozie
      • Flume
      • Sqoop

Apache Hadoop
  • A Hadoop cluster comprises a master node, a backup node, and a number of slave nodes.
  • The master node runs the NameNode and JobTracker processes; the slave nodes run the DataNode and TaskTracker components of Hadoop.
  • The backup node runs the Secondary NameNode process.
  • NameNode
      • The NameNode keeps the directory tree of all files in the file system and tracks where across the cluster the file data is kept. It does not store the data of these files itself. Client applications talk to the NameNode whenever they want to locate a file or to add, copy, move, or delete a file.
  • Secondary NameNode
      • The NameNode is a single point of failure for the HDFS cluster. An optional Secondary NameNode, hosted on a separate machine, creates checkpoints of the namespace.
  • JobTracker
      • The JobTracker is the service within Hadoop that distributes MapReduce tasks to specific nodes in the cluster: ideally the nodes that hold the data, or at least nodes in the same rack.
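The division of labour between the NameNode and the JobTracker can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Hadoop's actual implementation: the class names `ToyNameNode` and `ToyJobTracker`, the method names, the node and rack labels, and the file path are all hypothetical, and real Hadoop works with blocks and input splits rather than whole files. The sketch shows a NameNode that holds only location metadata, and a JobTracker that prefers a node holding the data, then a node in the same rack, then any free node.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """Hypothetical stand-in for a NameNode: stores metadata only, never file contents."""
    locations: dict = field(default_factory=dict)  # file path -> list of node names

    def add_file(self, path, nodes):
        # Record which nodes hold the data for this path.
        self.locations[path] = list(nodes)

    def locate(self, path):
        # Clients (and the JobTracker) ask where a file's data lives.
        return self.locations[path]


@dataclass
class ToyJobTracker:
    """Hypothetical stand-in for a JobTracker with a locality preference."""
    name_node: ToyNameNode
    rack_of: dict  # node name -> rack name (assumed known cluster topology)

    def pick_node(self, path, free_nodes):
        holders = self.name_node.locate(path)

        # 1. Ideal case: a free node that already holds the data.
        for node in free_nodes:
            if node in holders:
                return node, "data-local"

        # 2. Next best: a free node in the same rack as a holder.
        holder_racks = {self.rack_of[h] for h in holders}
        for node in free_nodes:
            if self.rack_of[node] in holder_racks:
                return node, "rack-local"

        # 3. Fallback: any free node, or nothing if the cluster is busy.
        if free_nodes:
            return free_nodes[0], "off-rack"
        return None, "no-free-node"


# Example cluster: three nodes in rackA, one in rackB (all names are made up).
name_node = ToyNameNode()
name_node.add_file("/sensors/day1.csv", ["node1", "node2"])
tracker = ToyJobTracker(
    name_node,
    {"node1": "rackA", "node2": "rackA", "node3": "rackA", "node4": "rackB"},
)

print(tracker.pick_node("/sensors/day1.csv", ["node4", "node2"]))  # ('node2', 'data-local')
print(tracker.pick_node("/sensors/day1.csv", ["node4", "node3"]))  # ('node3', 'rack-local')
print(tracker.pick_node("/sensors/day1.csv", ["node4"]))           # ('node4', 'off-rack')
```

Note that the toy JobTracker never touches file contents: it only asks the NameNode where the data is, which mirrors the metadata-only role described above.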
