Sha256: 088cc019ec5e420164d90a19fb13ce9ca93693f4a9eefaecb6d8bd9620340a8d

Size: 1.18 KB

Versions: 10

Stored size: 1.18 KB

Contents

#
# Group all visitors, and then troll through all the pages they've visited,
# breaking each into distinct visits (where more than an [hour|day|whatever]
# separates subsequent pageviews).
#

#
# The mapper parses log files and creates a visitor_id from the visitor's
# user_id, cookie or ip. It emits
#
#    <visitor_id>  <datetime>   <url_path>
#
# where the partition key is visitor_id, and we sort by visitor_id and datetime.
#
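
#
# A minimal sketch of the mapper step described above, in plain Ruby rather
# than Wukong's own streamer classes. The input layout is an assumption made
# for illustration: tab-separated fields ip, cookie, user_id, datetime,
# url_path, with "-" standing for a missing value. The helper names
# visitor_id_for and map_line are hypothetical.
#

```ruby
# Pick the most specific identifier available: user_id, then cookie, then ip.
# Empty strings and "-" (assumed placeholder for a missing value) are skipped.
def visitor_id_for(user_id, cookie, ip)
  [user_id, cookie, ip].find { |v| v && !v.empty? && v != '-' }
end

# Turn one log line (assumed layout: ip, cookie, user_id, datetime, url_path)
# into "<visitor_id>\t<datetime>\t<url_path>", or nil if the line is unusable.
def map_line(line)
  ip, cookie, user_id, datetime, url_path = line.chomp.split("\t", -1)
  visitor_id = visitor_id_for(user_id, cookie, ip)
  return nil unless visitor_id && datetime && url_path
  [visitor_id, datetime, url_path].join("\t")
end
```

#
# Feeding each input line through map_line and printing the non-nil results
# gives records that can be partitioned on the first field and sorted on the
# first two.
#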

#
# Reducer:
#
# The reducer is given all page requests for the given visitor id, sorted by
# timestamp.
#
# It groups the pageviews into visits (a gap of more than
# DISTINCT_VISIT_TIMEGAP between pageviews starts a new visit) and emits
#
#     trail        <visitor_id> <n_pages_in_visit> <duration> <timestamp> < page1,page2,... >
#
# where the last field is a comma-separated string of URL-encoded paths (any
# internal comma is converted to %2C).
#
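
#
# A minimal sketch of the visit-splitting step, in plain Ruby. Several details
# are assumptions, not taken from the original: DISTINCT_VISIT_TIMEGAP is set
# to one hour, <timestamp> is taken to be the start of the visit, <duration>
# is in whole seconds, and only the comma is escaped in each path. The method
# names are hypothetical.
#

```ruby
require 'time'

DISTINCT_VISIT_TIMEGAP = 60 * 60 # seconds; assumed value (one hour)

# Escape literal commas so the comma-joined page list stays unambiguous.
def encode_path(path)
  path.gsub(',', '%2C')
end

# pageviews: array of [Time, url_path] pairs for one visitor, sorted by time.
# Starts a new visit whenever consecutive pageviews are more than timegap apart.
def split_into_visits(pageviews, timegap = DISTINCT_VISIT_TIMEGAP)
  pageviews.slice_when { |(prev_time, _), (next_time, _)| next_time - prev_time > timegap }.to_a
end

# Build one tab-separated "trail" record per visit.
def trail_records(visitor_id, pageviews)
  split_into_visits(pageviews).map do |visit|
    started_at = visit.first[0]
    duration   = (visit.last[0] - started_at).to_i
    pages      = visit.map { |_, path| encode_path(path) }.join(',')
    ['trail', visitor_id, visit.size, duration, started_at.utc.iso8601, pages].join("\t")
  end
end
```

#
# Because the input arrives sorted by timestamp, a single pass comparing each
# pageview with the one before it is enough; no lookahead or re-sorting is
# needed.
#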
# You can instead emit
#
#     page_trails  <page1>      <n_pages_in_visit> <duration> <timestamp> < page1,page2,... >
#     page_trails  <page2>      <n_pages_in_visit> <duration> <timestamp> < page1,page2,... >
#     ....
#     page_trails  <pagen>      <n_pages_in_visit> <duration> <timestamp> < page1,page2,... >
#
# to discover all trails passing through a given page.
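
#
# A minimal sketch of the page_trails variant, in plain Ruby. It assumes the
# visit has already been summarized and that the paths are already encoded;
# the method name page_trail_records is hypothetical.
#

```ruby
# Emit one tab-separated "page_trails" record per page in the visit, each keyed
# by that page and carrying the full trail, so that grouping on the second
# field collects every trail that passes through a given page.
def page_trail_records(n_pages, duration, timestamp, pages)
  trail = pages.join(',')
  pages.map do |page|
    ['page_trails', page, n_pages, duration, timestamp, trail].join("\t")
  end
end
```

#
# A page that appears twice in one visit yields two records here, mirroring
# the one-line-per-page listing above; de-duplicating first is a possible
# refinement.
#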

Version data entries

10 entries across 10 versions & 1 rubygems

Version Path
wukong-1.5.4 examples/server_logs/breadcrumbs.rb
wukong-1.5.3 examples/server_logs/breadcrumbs.rb
wukong-1.5.2 examples/server_logs/breadcrumbs.rb
wukong-1.5.1 examples/server_logs/breadcrumbs.rb
wukong-1.5.0 examples/server_logs/breadcrumbs.rb
wukong-1.4.12 examples/server_logs/breadcrumbs.rb
wukong-1.4.11 examples/server_logs/breadcrumbs.rb
wukong-1.4.10 examples/server_logs/breadcrumbs.rb
wukong-1.4.9 examples/server_logs/breadcrumbs.rb
wukong-1.4.7 examples/server_logs/breadcrumbs.rb