www.HadoopExam.com

HadoopExam Learning Resources


HBase Interview Questions - HBase Questions 2

 

Q11. Which API command will you use to read data from HBase?
Ans : Get
Example (assumes an open Table handle named usersTable):
Get g = new Get(Bytes.toBytes("John Smith"));
Result r = usersTable.get(g);


Q12. What is the BlockCache ?
Ans : Alongside the MemStore, HBase keeps a read cache, the BlockCache, in the JVM heap. The BlockCache is designed to keep frequently accessed data from the HFiles in memory so as to avoid disk reads. Each column family has its own BlockCache.
The “Block” in BlockCache is the unit of data that HBase reads from disk in a single pass. The HFile is physically laid out as a sequence of blocks plus an index over those blocks. This means reading a block from HBase requires only looking up that block’s location in the index and retrieving it from disk. The block is the smallest indexed unit of data and the smallest unit of data that can be read from disk.


Q13. The block size is configured at which level ?
Ans : The block size is configured per column family, and the default value is 64 KB. You may want to tweak this value larger or smaller depending on your use case.
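As a sketch, the per-column-family block size can be set when the table is defined. This assumes the HBase 2.x ColumnFamilyDescriptorBuilder API and a running cluster; the table name "users" and family name "info" are illustrative:

```java
// Hypothetical example: create a "users" table whose "info" column family
// uses a 16 KB block size (smaller than the 64 KB default, which favours
// random reads). Assumes an open Admin handle named admin.
TableDescriptor desc = TableDescriptorBuilder
    .newBuilder(TableName.valueOf("users"))
    .setColumnFamily(ColumnFamilyDescriptorBuilder
        .newBuilder(Bytes.toBytes("info"))
        .setBlocksize(16 * 1024)   // per-column-family setting, in bytes
        .build())
    .build();
admin.createTable(desc);
```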

Q14. If your requirement is to read data randomly from the HBase users table, what would be your preference for the block size ?
Ans : Smaller. Having smaller blocks creates a larger index and thereby consumes more memory. If you frequently perform sequential scans, reading many blocks at a time, you can afford a larger block size. This allows you to save on memory because larger blocks mean fewer index entries and thus a smaller index.

Q15. What is a block, in the BlockCache ?
Ans : The “Block” in BlockCache is the unit of data that HBase reads from disk in a single pass. The HFile is physically laid out as a sequence of blocks plus an index over those blocks.
This means reading a block from HBase requires only looking up that block’s location in the index and retrieving it from disk. The block is the smallest indexed unit of data and is the smallest unit of data that can be read from disk.
The block size is configured per column family, and the default value is 64 KB. You may want to tweak this value larger or smaller depending on your use case.

Q16. While reading data from HBase, from which three places is data reconciled before the value is returned ?
Ans: a. Reading a row from HBase requires first checking the MemStore for any pending modifications.
b. Then the BlockCache is examined to see if the block containing this row has been recently accessed.
c. Finally, the relevant HFiles on disk are accessed.
d. Note that HFiles contain a snapshot of the MemStore at the point when it was flushed. Data for a complete row can be stored across multiple HFiles.
e. In order to read a complete row, HBase must read across all HFiles that might contain information for that row in order to compose the complete record.
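The reconciliation order above can be sketched with plain in-memory maps. These classes are hypothetical stand-ins, not the real HBase internals: the point is only that the newest source (MemStore) shadows the cache, which in turn shadows the flushed HFiles:

```java
import java.util.Map;
import java.util.TreeMap;

// Simplified sketch of HBase's read reconciliation. The three maps are
// stand-ins for the real structures; newer sources shadow older ones.
public class ReadPathSketch {
    // Pending, unflushed edits (stand-in for the MemStore).
    static final Map<String, String> memStore = new TreeMap<>();
    // Recently read blocks (stand-in for the BlockCache).
    static final Map<String, String> blockCache = new TreeMap<>();
    // Flushed, immutable data on disk (stand-in for the HFiles).
    static final Map<String, String> hFiles = new TreeMap<>();

    // Read a value: MemStore first, then BlockCache, then HFiles.
    static String get(String rowKey) {
        if (memStore.containsKey(rowKey)) return memStore.get(rowKey);
        if (blockCache.containsKey(rowKey)) return blockCache.get(rowKey);
        return hFiles.get(rowKey); // may be null if the row does not exist
    }

    public static void main(String[] args) {
        hFiles.put("row1", "flushed-value");
        memStore.put("row1", "pending-value"); // newer, shadows the HFile copy
        System.out.println(get("row1"));   // the pending MemStore value wins
        System.out.println(get("absent")); // no source has it: null
    }
}
```

Real HBase additionally merges across all HFiles that may hold cells for the row, since one row can be spread over several flush files.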

Q17. Once you delete data in HBase, when exactly is it physically removed ?
Ans : During major compaction. Because HFiles are immutable, it is not until a major compaction runs that the tombstone records are reconciled and space is truly recovered from deleted records.

Q18. Please describe minor compaction.
Ans : A minor compaction folds HFiles together, creating a larger HFile from multiple smaller HFiles.

Q19. Please describe major compaction ?
Ans : When a compaction operates over all HFiles in a column family in a given region, it’s called a major compaction. Upon completion of a major compaction, all HFiles in the column family are merged into a single file.
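A major compaction can also be requested manually. A sketch using the HBase client Admin API (the table name is illustrative, and an open Connection to a running cluster is assumed):

```java
// Request a major compaction of the "users" table. The call is
// asynchronous: it queues the compaction rather than waiting for it
// to finish. Assumes an open Connection named connection.
try (Admin admin = connection.getAdmin()) {
    admin.majorCompact(TableName.valueOf("users"));
}
```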

Q20. What is tombstone record ?
Ans : The Delete command doesn’t delete the value immediately. Instead, it marks the record for deletion. That is, a new “tombstone” record is written for that value, marking it as deleted. The tombstone is used to indicate that the deleted value should no longer be included in Get or Scan results. 
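For example, a delete issued through the client API only writes a tombstone. This is a sketch: the usersTable handle and row key are illustrative, and a live cluster is assumed:

```java
// Mark every cell in the row "John Smith" for deletion. The values stay
// in the HFiles until the next major compaction; until then they are
// merely masked from Get and Scan results by the tombstone written here.
Delete d = new Delete(Bytes.toBytes("John Smith"));
usersTable.delete(d);
```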
