- Learning Functional Programming in Go
- Lex Sheehan
- 134 characters
- 2021-07-02 23:13:49
MapReduce
MapReduce is a technique that splits a big dataset into many smaller ones. Each small dataset is processed separately but simultaneously on a different server. The results are then gathered and aggregated to produce a final result.
How does it work?
Suppose we have a lot of web servers and we want to determine the top requested pages across all of them. We can analyze web server access logs to find all the requested URLs, count them, and sort the results.
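The steps above can be sketched in Go as follows. This is a minimal, single-process illustration, not a distributed implementation: goroutines stand in for the separate servers, and the log format (`METHOD URL STATUS` per line) as well as the names `mapLog`, `reduceCounts`, `topPages`, and `urlCount` are assumptions made for this example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// urlCount pairs a requested URL with the number of times it was seen.
type urlCount struct {
	URL   string
	Count int
}

// mapLog is the map step: it counts the URLs in one server's access log.
// Assumed (hypothetical) line format: "METHOD URL STATUS".
func mapLog(lines []string) map[string]int {
	counts := make(map[string]int)
	for _, line := range lines {
		fields := strings.Fields(line)
		if len(fields) < 2 {
			continue // skip malformed lines
		}
		counts[fields[1]]++
	}
	return counts
}

// reduceCounts is the reduce step: it merges the per-server counts.
func reduceCounts(partials <-chan map[string]int) map[string]int {
	total := make(map[string]int)
	for p := range partials {
		for url, n := range p {
			total[url] += n
		}
	}
	return total
}

// topPages maps each log in its own goroutine, reduces the partial
// results, and returns the n most requested URLs.
func topPages(logs [][]string, n int) []urlCount {
	// Buffered so every goroutine can send without waiting for a reader.
	partials := make(chan map[string]int, len(logs))
	var wg sync.WaitGroup
	for _, serverLog := range logs {
		wg.Add(1)
		go func(lines []string) {
			defer wg.Done()
			partials <- mapLog(lines)
		}(serverLog)
	}
	wg.Wait()
	close(partials)

	total := reduceCounts(partials)
	result := make([]urlCount, 0, len(total))
	for url, c := range total {
		result = append(result, urlCount{URL: url, Count: c})
	}
	// Sort by count, highest first; break ties by URL for stable output.
	sort.Slice(result, func(i, j int) bool {
		if result[i].Count != result[j].Count {
			return result[i].Count > result[j].Count
		}
		return result[i].URL < result[j].URL
	})
	if len(result) > n {
		result = result[:n]
	}
	return result
}

func main() {
	logs := [][]string{
		{"GET /home 200", "GET /about 200", "GET /home 200"},
		{"GET /home 200", "GET /contact 404"},
	}
	for _, p := range topPages(logs, 2) {
		fmt.Println(p.URL, p.Count)
	}
}
```

Because each call to `mapLog` touches only its own log and its own map, the map steps share no state, which is what lets them run simultaneously. Only the reduce step sees all of the partial results.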
The following are good use cases for MapReduce:
- Gathering statistics from servers, for example, the top 10 users or the top 10 requested URLs
- Computing the frequencies of all keywords found in your data
The following use cases are not a good fit for MapReduce:
- Jobs that require shared state
- Finding individual records
- Small data