
Uses of Spark and Hadoop

Uses where Hadoop fits best:

- Analyzing archive data: YARN allows parallel processing of large amounts of data. Parts of the data are distributed across different nodes and processed in parallel.
- When instant results are not required: Hadoop batch processing is a good and economical solution.

Hadoop components:

- HDFS: a file system that distributes big data across various nodes in a master/slave architecture.
- NameNode: the master daemon that controls and runs the DataNodes, holding the metadata of all the files and tracking every change made in the cluster.
- DataNodes: the daemons running on each slave machine that store the actual data, serve read and write requests from clients, and maintain data blocks.
- YARN: the component that performs all processing activities by allocating resources and scheduling tasks through the ResourceManager and NodeManager.
- MapReduce: the component that does all the necessary computation and data processing over the Hadoop cluster.

Spark components:

- Spark Core: the component for large-scale c
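The MapReduce model mentioned above can be sketched in plain Python, without a Hadoop cluster. This is a minimal illustration of the three phases (map, shuffle, reduce) for a word count, not how the Hadoop framework is actually invoked; the function names here are made up for the example.

```python
from collections import defaultdict

def map_phase(document):
    # Mapper: emit a (word, 1) pair for every word in the input split
    for word in document.split():
        yield (word.lower(), 1)

def shuffle(mapped_pairs):
    # Shuffle/sort: group values by key, as the framework does
    # between the map and reduce phases
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reducer: sum the counts for each word
    return {word: sum(counts) for word, counts in groups.items()}

# Two "splits", each of which a real cluster would map on a different node
splits = ["Hadoop stores data", "Spark processes data"]
mapped = [pair for split in splits for pair in map_phase(split)]
result = reduce_phase(shuffle(mapped))
print(result)  # {'hadoop': 1, 'stores': 1, 'data': 2, 'spark': 1, 'processes': 1}
```

In a real cluster the splits live in HDFS blocks, the mappers run on the DataNodes holding those blocks, and YARN schedules the tasks; only the shuffle moves data across the network.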