Pig is right for you if you:
-
Need to analyze data that is small (kilobytes), tall (megabytes), grande (gigabytes), or venti (terabytes).
-
Want to be able to create, modify and reuse your analysis logic easily.
-
Process one data set at a time
-
… or need to combine multiple data sets.
-
Do simple processing (e.g., count the number of images on the web)
-
… or complex processing (e.g., count the number of images that contain faces).
Pig is not right for you if you:
-
Need to retrieve individual records, or small ranges of records, from a very large data set (e.g., look up Joe Smith’s customer profile). [Note: Pig does not support multiple outputs, or finding a single item in a very large data set (SQL is sufficient for that).]
-
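Have real-time data serving requirements (e.g., assemble a web page for Joe in under 100ms). [Note: real-time capability is weak, which is also one of the weaknesses of PageRank.] I need to think carefully about how this affects the real-time computation problem I have been considering -_-!!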
-
Need to be able to do random writes to specific data records.
-
Don’t like barnyard animals.


