Class: AeEasy::Core::Mock::FakeDb

Inherits:
Object
Defined in:
lib/ae_easy/core/mock/fake_db.rb

Overview

Fake in-memory database that emulates the black-box behavior of `Answersengine` database objects.

Constant Summary

PAGE_KEYS =

Page id keys, analogous to primary keys.

['gid'].freeze
OUTPUT_KEYS =

Output id keys, analogous to primary keys.

['_id', '_collection'].freeze
JOB_KEYS =

Job id keys, analogous to primary keys.

['job_id'].freeze
JOB_STATUSES =

Available job statuses.

{
  active: 'active',
  done: 'done',
  cancelled: 'cancelled',
  paused: 'paused'
}
DEFAULT_COLLECTION =

Default collection for saved outputs.

'default'

Constructor Details

#initialize(opts = {}) ⇒ FakeDb

Initialize fake database.

Parameters:

  • opts (Hash) (defaults to: {})

    Configuration options.

Options Hash (opts):

  • :job_id (Integer, nil)

    Job id default value.

  • :scraper_name (String, nil)

    Scraper name default value.

  • :page_gid (String, nil)

    Page gid default value.

  • :allow_page_gid_override (Boolean, nil) — default: false

    Whether the page gid can be overridden on page or output insert.

  • :allow_job_id_override (Boolean, nil) — default: false

    Whether the job id can be overridden on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 256

def initialize opts = {}
  self.job_id = opts[:job_id]
  self.scraper_name = opts[:scraper_name]
  self.page_gid = opts[:page_gid]
  @allow_page_gid_override = opts[:allow_page_gid_override].nil? ? false : !!opts[:allow_page_gid_override]
  @allow_job_id_override = opts[:allow_job_id_override].nil? ? false : !!opts[:allow_job_id_override]
end
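The two override flags are coerced to strict booleans, with `nil` or a missing key falling back to `false`. A minimal sketch of that coercion, using a hypothetical `normalize_flag` helper that is not part of the library:

```ruby
# Hypothetical helper mirroring the flag handling in #initialize:
# nil (or a missing key) becomes false, anything else is coerced with !!.
def normalize_flag(opts, key)
  opts[key].nil? ? false : !!opts[key]
end

missing = normalize_flag({}, :allow_job_id_override)
truthy  = normalize_flag({ allow_job_id_override: 'yes' }, :allow_job_id_override)
falsy   = normalize_flag({ allow_job_id_override: false }, :allow_job_id_override)

puts [missing, truthy, falsy].inspect
```

Any truthy value, not only `true`, enables the override.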

Class Method Details

.build_fake_job(opts = {}) ⇒ Hash

Build a fake job by using FakeDb engine.

Parameters:

  • opts (Hash) (defaults to: {})

    Configuration options (see #initialize).

Options Hash (opts):

  • :scraper_name (String) — default: nil

    Scraper name.

  • :job_id (Integer) — default: nil

    Job id.

  • :status (String) — default: 'done'

    Job status.

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 146

def self.build_fake_job opts = {}
  job = {
    'job_id' => opts[:job_id],
    'scraper_name' => opts[:scraper_name],
    'status' => (opts[:status] || 'done')
  }
  build_job job, opts
end

.build_fake_page(opts = {}) ⇒ Hash

Build a fake page by using FakeDb engine.

Parameters:

  • opts (Hash) (defaults to: {})

    Configuration options (see #initialize).

Options Hash (opts):

  • :url (String) — default: 'https://example.com'

    Page url.

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 74

def self.build_fake_page opts = {}
  page = {
    'url' => (opts[:url] || 'https://example.com')
  }
  build_page page, opts
end

.build_job(job, opts = {}) ⇒ Hash

Build a job with defaults by using FakeDb engine.

Parameters:

  • job (Hash)

    Job initial values.

  • opts (Hash) (defaults to: {})

    Configuration options (see #initialize).

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 132

def self.build_job job, opts = {}
  temp_db = AeEasy::Core::Mock::FakeDb.new opts
  temp_db.jobs << job
  temp_db.jobs.last
end

.build_page(page, opts = {}) ⇒ Hash

Build a page with defaults by using FakeDb engine.

Parameters:

  • page (Hash)

    Page initial values.

  • opts (Hash) (defaults to: {})

    Configuration options (see #initialize).

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 58

def self.build_page page, opts = {}
  opts = {
    allow_page_gid_override: true,
    allow_job_id_override: true
  }.merge opts
  temp_db = AeEasy::Core::Mock::FakeDb.new opts
  temp_db.pages << page
  temp_db.pages.first
end

.clean_uri(raw_url) ⇒ String

Clean a URL: remove the fragment, lowercase the scheme and host, and sort the query string by key.

Parameters:

  • raw_url (String)

    URL to clean.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 87

def self.clean_uri raw_url
  url = URI.parse(raw_url)
  url.hostname = url.hostname.downcase
  url.fragment = nil

  # Sort query string keys
  unless url.query.nil?
    query_string = CGI.parse(url.query)
    keys = query_string.keys.sort
    data = []
    keys.each do |key|
      query_string[key].each do |value|
        # Note: URI.encode was removed in Ruby 3.0, so this line requires an older Ruby.
        data << "#{URI.encode key}=#{URI.encode value}"
      end
    end
    url.query = data.join('&')
  end
  url.to_s
end
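The cleanup steps can be tried outside the library with a standalone sketch. It is not the library's code: the `clean_url` name is hypothetical, and `URI.decode_www_form` / `URI.encode_www_form` are swapped in for `CGI.parse` and `URI.encode`, so percent-encoding of unusual characters may differ from the original.

```ruby
require 'uri'

# Standalone sketch of the same cleanup steps (assumed helper name).
def clean_url(raw_url)
  url = URI.parse(raw_url)
  url.hostname = url.hostname.downcase
  url.fragment = nil

  unless url.query.nil?
    # Decode into [key, value] pairs, then sort by key while keeping the
    # original order of values that share the same key.
    pairs = URI.decode_www_form(url.query)
    sorted = pairs.each_with_index
                  .sort_by { |(key, _value), index| [key, index] }
                  .map(&:first)
    url.query = URI.encode_www_form(sorted)
  end
  url.to_s
end

cleaned = clean_url('https://Example.COM/path?b=2&a=1#top')
puts cleaned
```

Two URLs that differ only in host case, fragment, or query key order clean to the same string, which is what lets them share a page gid.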

.fake_uuid(seed = nil) ⇒ String

Generate a fake UUID.

Parameters:

  • seed (nil) (defaults to: nil)

    Object to use as seed for uuid.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 37

def self.fake_uuid seed = nil
  seed ||= (Time.new.to_f + rand)
  Digest::SHA1.hexdigest seed.to_s
end
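The returned value is a 40-character SHA-1 hex digest rather than an RFC 4122 formatted UUID. A standalone copy of the method body, runnable without the library, shows that a fixed seed always yields the same id:

```ruby
require 'digest/sha1'

# Standalone copy of the logic above, for experimentation only.
def fake_uuid(seed = nil)
  seed ||= (Time.new.to_f + rand)
  Digest::SHA1.hexdigest(seed.to_s)
end

seeded = fake_uuid('abc')
random = fake_uuid

puts seeded         # deterministic for a given seed
puts random.length  # always 40 hexadecimal characters
```

Because the seed is converted with `to_s`, `'abc'` and `:abc` produce the same id.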

.new_collection(keys, opts = {}) ⇒ AeEasy::Core::SmartCollection

Generate a smart collection with keys and initial values.

Parameters:

  • keys (Array)

    Analogous to primary keys; each key combination must be unique.

  • opts (Hash) (defaults to: {})

    Configuration options (see AeEasy::Core::SmartCollection#initialize).

Returns:

  • (AeEasy::Core::SmartCollection)

# File 'lib/ae_easy/core/mock/fake_db.rb', line 28

def self.new_collection keys, opts = {}
  AeEasy::Core::SmartCollection.new keys, opts
end

.output_uuid(data) ⇒ String

Generate a fake UUID based on output fields without `_` prefix.

Parameters:

  • data (Hash)

    Output data.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 47

def self.output_uuid data
  seed = data.select{|k,v|k.to_s =~ /^[^_]/}.hash
  fake_uuid seed
end
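The field filter can be sketched in isolation as follows. The `public_fields` helper name is an assumption made for illustration; the filter expression and the hashing steps are taken from the listing above.

```ruby
require 'digest/sha1'

# Keep only fields whose name does not start with an underscore
# (hypothetical helper name; same filter expression as the listing).
def public_fields(data)
  data.select { |key, _value| key.to_s =~ /^[^_]/ }
end

def output_uuid(data)
  Digest::SHA1.hexdigest(public_fields(data).hash.to_s)
end

a = { '_id' => '1', '_collection' => 'products', 'name' => 'Widget' }
b = { '_id' => '2', '_collection' => 'default',  'name' => 'Widget' }

puts public_fields(a).inspect
puts output_uuid(a) == output_uuid(b)
```

Since the seed comes from `Hash#hash`, which Ruby randomizes per process, the resulting ids are stable within one run but not across runs.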

Instance Method Details

#allow_job_id_override? ⇒ Boolean

Whether a user-supplied job id may override the default on page or output insert.

Returns:

  • (Boolean)

    `true` when allowed, else `false`.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 242

def allow_job_id_override?
  @allow_job_id_override ||= false
end

#allow_page_gid_override? ⇒ Boolean

Whether a user-supplied page gid may override the default on page or output insert.

Returns:

  • (Boolean)

    `true` when allowed, else `false`.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 224

def allow_page_gid_override?
  @allow_page_gid_override ||= false
end

#disable_job_id_override ⇒ Object

Disable job id override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 234

def disable_job_id_override
  @allow_job_id_override = false
end

#disable_page_gid_override ⇒ Object

Disable page gid override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 216

def disable_page_gid_override
  @allow_page_gid_override = false
end

#enable_job_id_override ⇒ Object

Enable job id override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 229

def enable_job_id_override
  @allow_job_id_override = true
end

#enable_page_gid_override ⇒ Object

Enable page gid override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 211

def enable_page_gid_override
  @allow_page_gid_override = true
end

#ensure_job(target_job_id = nil) ⇒ Hash

Get the job with the given id, or create it from the current values if it does not exist.

Parameters:

  • target_job_id (Integer) (defaults to: nil)

    Job id whose existence should be ensured. Defaults to the current job id.

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 160

def ensure_job target_job_id = nil
  target_job_id = job_id if target_job_id.nil?
  job = jobs.find{|v|v['job_id'] == target_job_id}
  return job unless job.nil?
  job = {
    'job_id' => target_job_id,
    'scraper_name' => scraper_name,
  }
  job['status'] = 'active' if target_job_id == job_id
  jobs << job
  jobs.last
end
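The find-or-create flow can be sketched with a plain array standing in for the jobs collection. This is a simplification: the real collection is a SmartCollection that also applies defaults, and the extra parameters here replace instance state.

```ruby
# Plain-array stand-in for the jobs collection (assumption: no defaults applied).
def ensure_job(jobs, current_job_id, scraper_name, target_job_id = nil)
  target_job_id = current_job_id if target_job_id.nil?
  job = jobs.find { |item| item['job_id'] == target_job_id }
  return job unless job.nil?

  job = { 'job_id' => target_job_id, 'scraper_name' => scraper_name }
  # Only the current job is explicitly marked as active.
  job['status'] = 'active' if target_job_id == current_job_id
  jobs << job
  job
end

jobs  = []
first = ensure_job(jobs, 1, 'my_scraper')     # creates job 1 as active
again = ensure_job(jobs, 1, 'my_scraper')     # finds the existing job
other = ensure_job(jobs, 1, 'my_scraper', 7)  # creates job 7 without a status

puts jobs.map { |job| job['job_id'] }.inspect
```

In the library, a job created for a non-current id gets its status from the collection defaults, which this sketch leaves out.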

#generate_job_id ⇒ Integer

Generate a fake job_id.

Returns:

  • (Integer)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 274

def generate_job_id
  jobs.count < 1 ? 1 : (jobs.max{|a,b|a['job_id'] <=> b['job_id']}['job_id'] + 1)
end
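The id rule is "1 for an empty collection, otherwise highest existing id plus one". A small standalone equivalent, with a hypothetical `next_job_id` name and a plain array of job hashes:

```ruby
# Hypothetical standalone equivalent of the id rule above.
def next_job_id(jobs)
  return 1 if jobs.empty?
  jobs.map { |job| job['job_id'] }.max + 1
end

puts next_job_id([])
puts next_job_id([{ 'job_id' => 3 }, { 'job_id' => 10 }])
```

Ids are never reused from gaps: with jobs 3 and 10 present, the next id is 11.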

#generate_output_id(data) ⇒ String

Generate a fake UUID for outputs.

Parameters:

  • data (Hash)

    Output data.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 394

def generate_output_id data
  # Generate random UUID to match AnswersEngine behavior
  self.class.fake_uuid
end

#generate_page_gid(page_data) ⇒ String

Generate a fake UUID based on page data:

* url
* method
* headers
* fetch_type
* cookie
* no_redirect
* body
* ua_type

Parameters:

  • page_data (Hash)

    Page data.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 321

def generate_page_gid page_data
  fields = [
    'url',
    'method',
    'headers',
    'fetch_type',
    'cookie',
    'no_redirect',
    'body',
    'ua_type'
  ]
  data = page_data.select{|k,v|fields.include? k}
  data['url'] = self.class.clean_uri data['url']
  data['headers'] = self.class.format_headers data['headers']
  data['cookie'] = AeEasy::Core::Helper::Cookie.parse_from_request data['cookie'] unless data['cookie'].nil?
  seed = data.select{|k,v|fields.include? k}.hash
  checksum = self.class.fake_uuid seed
  "#{URI.parse(data['url']).hostname}-#{checksum}"
end
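The resulting gid has the shape `<hostname>-<checksum>`, and only the listed fields influence it. The simplified sketch below illustrates this; it deliberately skips the URL cleaning, header formatting, and cookie parsing steps, and the `sketch_page_gid` and `GID_FIELDS` names are assumptions.

```ruby
require 'digest/sha1'
require 'uri'

# Fields that take part in the gid, as listed in the documentation above.
GID_FIELDS = %w[url method headers fetch_type cookie no_redirect body ua_type].freeze

# Simplified sketch: no URL cleaning, header formatting, or cookie parsing.
def sketch_page_gid(page_data)
  data = page_data.select { |key, _value| GID_FIELDS.include?(key) }
  checksum = Digest::SHA1.hexdigest(data.hash.to_s)
  "#{URI.parse(data['url']).hostname}-#{checksum}"
end

page = { 'url' => 'https://example.com/a', 'method' => 'GET', 'priority' => 5 }
gid  = sketch_page_gid(page)

puts gid.start_with?('example.com-')
puts gid == sketch_page_gid(page.merge('priority' => 9)) # priority is ignored
```

Changing a field outside the list, such as `priority` here, leaves the gid untouched, while changing `method` or `url` produces a different one.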

#generate_scraper_name ⇒ String

Generate a fake scraper name.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 267

def generate_scraper_name
  Faker::Internet.unique.slug
end

#job_id ⇒ Integer?

Fake job id.

Returns:

  • (Integer, nil)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 188

def job_id
  @job_id ||= generate_job_id
end

#job_id=(value) ⇒ Object

Set fake job id value.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 193

def job_id= value
  @job_id = value
  ensure_job
  job_id
end

#jobs ⇒ AeEasy::Core::SmartCollection

Stored job collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 294

def jobs
  return @jobs unless @jobs.nil?
  collection = self.class.new_collection JOB_KEYS,
    defaults: job_defaults
  collection.bind_event(:before_defaults) do |collection, raw_item|
    AeEasy::Core.deep_stringify_keys raw_item
  end
  collection.bind_event(:before_insert) do |collection, item, match|
    item['job_id'] ||= generate_job_id
    item
  end
  @jobs ||= collection
end

#outputs ⇒ AeEasy::Core::SmartCollection

Stored output collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 415

def outputs
  return @outputs unless @outputs.nil?
  collection = self.class.new_collection OUTPUT_KEYS,
    defaults: output_defaults
  collection.bind_event(:before_defaults) do |collection, raw_item|
    item = AeEasy::Core.deep_stringify_keys raw_item
    item.delete '_job_id' unless allow_job_id_override?
    item.delete '_gid_id' unless allow_page_gid_override?
    item
  end
  collection.bind_event(:before_insert) do |collection, item, match|
    item['_id'] ||= generate_output_id item
    item
  end
  collection.bind_event(:after_insert) do |collection, item|
    ensure_job item['_job_id']
  end
  @outputs ||= collection
end

#page_gid ⇒ String?

Current fake page gid. Defaults to a generated fake UUID when unset.

Returns:

  • (String, nil)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 201

def page_gid
  @page_gid ||= self.class.fake_uuid
end

#page_gid=(value) ⇒ Object

Set current fake page gid value.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 206

def page_gid= value
  @page_gid = value
end

#pages ⇒ AeEasy::Core::SmartCollection

Note:

The page gid is replaced on insert by an auto-generated UUID unless page gid overriding is enabled (see #allow_page_gid_override?).

Stored page collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 367

def pages
  return @pages unless @pages.nil?

  collection = self.class.new_collection PAGE_KEYS,
    defaults: page_defaults
  collection.bind_event(:before_defaults) do |collection, raw_item|
    item = AeEasy::Core.deep_stringify_keys raw_item
    item.delete 'job_id' unless allow_job_id_override?
    item
  end
  collection.bind_event(:before_insert) do |collection, item, match|
    if item['gid'].nil? || !allow_page_gid_override?
      item['gid'] = generate_page_gid item
    end
    item
  end
  collection.bind_event(:after_insert) do |collection, item|
    ensure_job item['job_id']
  end
  @pages ||= collection
end

#query(collection, filter, offset = 0, limit = nil) ⇒ Object

Note:

Warning: filtering uses a full table scan, so this method should be used in test suites only.

Search items from a collection.

Parameters:

  • collection (Symbol)

    Allowed values: `:outputs`, `:pages`, `:jobs`.

  • filter (Hash)

    Filters to query.

  • offset (Integer) (defaults to: 0)

    Number of matching items to skip.

  • limit (Integer, nil) (defaults to: nil)

    Maximum number of results. Set to `nil` for unlimited.

Raises:

  • ArgumentError On unknown collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 462

def query collection, filter, offset = 0, limit = nil
  return [] unless limit.nil? || limit > 0

  # Get collection items
  items = case collection
  when :outputs
    outputs
  when :pages
    pages
  when :jobs
    jobs
  else
    raise ArgumentError.new "Unknown collection #{collection}."
  end

  # Search items
  count = 0
  matches = []
  items.each do |item|
    next unless match? item, filter
    count += 1

    # Skip until offset
    next unless offset < count
    # Break on limit reach
    break unless limit.nil? || matches.count < limit
    matches << item
  end
  matches
end
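The offset and limit handling can be exercised on a plain array with the standalone sketch below. The `match?` method is not shown on this page, so `simple_match?` is an assumed stand-in that requires every filter key to equal the item's value; `scan` is likewise a hypothetical name.

```ruby
# Assumed matcher: every filter key must equal the item's value.
def simple_match?(item, filter)
  filter.all? { |key, value| item[key] == value }
end

# Same scan loop as the listing, over a plain array.
def scan(items, filter, offset = 0, limit = nil)
  return [] unless limit.nil? || limit > 0

  count = 0
  matches = []
  items.each do |item|
    next unless simple_match?(item, filter)
    count += 1
    next unless offset < count                        # skip the first `offset` matches
    break unless limit.nil? || matches.count < limit  # stop once `limit` is reached
    matches << item
  end
  matches
end

items = [
  { 'id' => 1, 'type' => 'a' },
  { 'id' => 2, 'type' => 'b' },
  { 'id' => 3, 'type' => 'a' },
  { 'id' => 4, 'type' => 'a' }
]
found = scan(items, { 'type' => 'a' }, 1, 2)
puts found.map { |item| item['id'] }.inspect
```

The offset counts matching items rather than raw positions, so with an offset of 1 the first match (id 1) is skipped and the result holds ids 3 and 4.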

#scraper_name ⇒ String?

Fake scraper_name.

Returns:

  • (String, nil)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 175

def scraper_name
  @scraper_name ||= 'my_scraper'
end

#scraper_name=(value) ⇒ Object

Set fake scraper_name value.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 180

def scraper_name= value
  job = ensure_job
  @scraper_name = value
  job['scraper_name'] = scraper_name
end