Class: AeEasy::Core::Mock::FakeDb

Inherits:
Object
Defined in:
lib/ae_easy/core/mock/fake_db.rb

Overview

Fake in-memory database that emulates the black-box behavior of `Answersengine` database objects.

Constant Summary

PAGE_KEYS =

Page id keys, analogous to primary keys.

['gid'].freeze
OUTPUT_KEYS =

Output id keys, analogous to primary keys.

['_id', '_collection'].freeze
JOB_KEYS =

Job id keys, analogous to primary keys.

['job_id'].freeze
JOB_STATUSES =

Available job statuses.

{
  active: 'active',
  done: 'done',
  cancelled: 'cancelled',
  paused: 'paused'
}
DEFAULT_COLLECTION =

Default collection for saved outputs.

'default'

Class Method Summary

Instance Method Summary

Constructor Details

#initialize(opts = {}) ⇒ FakeDb

Initialize fake database.

Parameters:

  • opts (Hash) (defaults to: {})

    ({}) Configuration options.

Options Hash (opts):

  • :job_id (Integer, nil)

    Job id default value.

  • :scraper_name (String, nil)

    Scraper name default value.

  • :page_gid (String, nil)

    Page gid default value.

  • :allow_page_gid_override (Boolean, nil) — default: false

    Specifies whether the page gid can be overridden on page or output insert.

  • :allow_job_id_override (Boolean, nil) — default: false

    Specifies whether the job id can be overridden on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 246

def initialize opts = {}
  self.job_id = opts[:job_id]
  self.scraper_name = opts[:scraper_name]
  self.page_gid = opts[:page_gid]
  @allow_page_gid_override = opts[:allow_page_gid_override].nil? ? false : !!opts[:allow_page_gid_override]
  @allow_job_id_override = opts[:allow_job_id_override].nil? ? false : !!opts[:allow_job_id_override]
end

Class Method Details

.build_fake_job(opts = {}) ⇒ Hash

Build a fake job by using FakeDb engine.

Parameters:

  • opts (Hash) (defaults to: {})

    ({}) Configuration options (see #initialize).

Options Hash (opts):

  • :scraper_name (String) — default: nil

    Scraper name.

  • :job_id (Integer) — default: nil

    Job id.

  • :status (String) — default: 'done'

    Job status.

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 136

def self.build_fake_job opts = {}
  job = {
    'job_id' => opts[:job_id],
    'scraper_name' => opts[:scraper_name],
    'status' => (opts[:status] || 'done')
  }
  build_job job, opts
end

.build_fake_page(opts = {}) ⇒ Hash

Build a fake page by using FakeDb engine.

Parameters:

  • opts (Hash) (defaults to: {})

    ({}) Configuration options (see #initialize).

Options Hash (opts):

  • :url (String) — default: 'https://example.com'

    Page url.

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 64

def self.build_fake_page opts = {}
  page = {
    'url' => (opts[:url] || 'https://example.com')
  }
  build_page page, opts
end

.build_job(job, opts = {}) ⇒ Hash

Build a job with defaults by using FakeDb engine.

Parameters:

  • job (Hash)

    Job initial values.

  • opts (Hash) (defaults to: {})

    ({}) Configuration options (see #initialize).

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 122

def self.build_job job, opts = {}
  temp_db = AeEasy::Core::Mock::FakeDb.new opts
  temp_db.jobs << job
  temp_db.jobs.last
end

.build_page(page, opts = {}) ⇒ Hash

Build a page with defaults by using FakeDb engine.

Parameters:

  • page (Hash)

    Page initial values.

  • opts (Hash) (defaults to: {})

    ({}) Configuration options (see #initialize).

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 48

def self.build_page page, opts = {}
  opts = {
    allow_page_gid_override: true,
    allow_job_id_override: true
  }.merge opts
  temp_db = AeEasy::Core::Mock::FakeDb.new opts
  temp_db.pages << page
  temp_db.pages.first
end

.clean_uri(raw_url) ⇒ String

Clean a URL: remove the fragment, lowercase the scheme and host, and sort the query string by key.

Parameters:

  • raw_url (String)

    URL to clean.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 77

def self.clean_uri raw_url
  url = URI.parse(raw_url)
  url.hostname = url.hostname.downcase
  url.fragment = nil

  # Sort query string keys
  unless url.query.nil?
    query_string = CGI.parse(url.query)
    keys = query_string.keys.sort
    data = []
    keys.each do |key|
      query_string[key].each do |value|
        data << "#{URI.encode key}=#{URI.encode value}"
      end
    end
    url.query = data.join('&')
  end
  url.to_s
end
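
`URI.encode`, used in the listing above, was removed in Ruby 3.0, so that listing only runs on older Rubies. The following is a minimal standard-library sketch of the same normalization, not the library's implementation. It assumes `URI.decode_www_form` and `URI.encode_www_form` are acceptable substitutes (they encode spaces as `+`, which `URI.encode` did not), and `normalize_url` is a hypothetical name.

```ruby
require 'uri'

# Hypothetical stand-in for FakeDb.clean_uri that avoids URI.encode.
def normalize_url(raw_url)
  url = URI.parse(raw_url)
  url.hostname = url.hostname.downcase
  url.fragment = nil

  unless url.query.nil?
    # Decode into [key, value] pairs, then sort by key while keeping the
    # original order of values that share a key.
    pairs = URI.decode_www_form(url.query)
    sorted = pairs.each_with_index
                  .sort_by { |(key, _value), index| [key, index] }
                  .map(&:first)
    url.query = URI.encode_www_form(sorted)
  end
  url.to_s
end

puts normalize_url('https://Example.COM/list?b=2&a=1#top')
# prints "https://example.com/list?a=1&b=2"
```

Two URLs that differ only in fragment, host case, or parameter order normalize to the same string, which is what lets them map to the same page gid.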

.fake_uuid(seed = nil) ⇒ String

Generate a fake UUID.

Parameters:

  • seed (nil) (defaults to: nil)

    Object to use as seed for uuid.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 37

def self.fake_uuid seed = nil
  seed ||= (Time.new.to_f + rand)
  Digest::SHA1.hexdigest seed.to_s
end
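
Despite the name, the value returned is a 40-character SHA-1 hex digest of the seed's string form rather than an RFC 4122 UUID. A stand-alone sketch of the same idea, using only the standard library (the top-level `fake_uuid` function here is a copy for illustration, not the class method itself):

```ruby
require 'digest/sha1'

# Stand-alone copy of the idea behind FakeDb.fake_uuid.
def fake_uuid(seed = nil)
  # Without a seed, fall back to a time-plus-random value.
  seed ||= (Time.now.to_f + rand)
  Digest::SHA1.hexdigest(seed.to_s)
end

first  = fake_uuid('page-1')
second = fake_uuid('page-1')

puts first == second  # the same seed always yields the same digest
puts first.length     # SHA-1 hex digests are 40 characters long
```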

.new_collection(keys, opts = {}) ⇒ AeEasy::Core::SmartCollection

Generate a smart collection with keys and initial values.

Parameters:

  • keys (Array)

    Analogous to primary keys; each combination of key values is unique.

  • opts (Hash) (defaults to: {})

    Configuration options (see AeEasy::Core::SmartCollection#initialize).

Returns:

  • (AeEasy::Core::SmartCollection)

# File 'lib/ae_easy/core/mock/fake_db.rb', line 28

def self.new_collection keys, opts = {}
  AeEasy::Core::SmartCollection.new keys, opts
end

Instance Method Details

#allow_job_id_override? ⇒ Boolean

Specifies whether the user may override the job id on page or output insert.

Returns:

  • (Boolean)

    `true` when allowed, else `false`.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 232

def allow_job_id_override?
  @allow_job_id_override ||= false
end

#allow_page_gid_override? ⇒ Boolean

Specifies whether the user may override the page gid on page or output insert.

Returns:

  • (Boolean)

    `true` when allowed, else `false`.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 214

def allow_page_gid_override?
  @allow_page_gid_override ||= false
end

#disable_job_id_override ⇒ Object

Disable job id override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 224

def disable_job_id_override
  @allow_job_id_override = false
end

#disable_page_gid_override ⇒ Object

Disable page gid override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 206

def disable_page_gid_override
  @allow_page_gid_override = false
end

#enable_job_id_override ⇒ Object

Enable job id override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 219

def enable_job_id_override
  @allow_job_id_override = true
end

#enable_page_gid_override ⇒ Object

Enable page gid override on page or output insert.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 201

def enable_page_gid_override
  @allow_page_gid_override = true
end

#ensure_job(target_job_id = nil) ⇒ Hash

Get the job with the given id, or create it from the current values if it does not exist.

Parameters:

  • target_job_id (Integer) (defaults to: nil)

    (nil) Job id whose existence is ensured. Defaults to the current job id.

Returns:

  • (Hash)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 150

def ensure_job target_job_id = nil
  target_job_id = job_id if target_job_id.nil?
  job = jobs.find{|v|v['job_id'] == target_job_id}
  return job unless job.nil?
  job = {
    'job_id' => target_job_id,
    'scraper_name' => scraper_name,
  }
  job['status'] = 'active' if target_job_id == job_id
  jobs << job
  jobs.last
end
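
The method above is a find-or-create lookup. A minimal sketch of that pattern follows, under the assumption that a plain `Array` can stand in for the `SmartCollection`; the extra parameters are hypothetical and replace the instance state, and any status that the collection's defaults might add is not modeled.

```ruby
# Hypothetical stand-in: a plain Array replaces the SmartCollection, and
# the current job id and scraper name are passed in explicitly.
def ensure_job(jobs, current_job_id, scraper_name, target_job_id = nil)
  target_job_id = current_job_id if target_job_id.nil?
  job = jobs.find { |candidate| candidate['job_id'] == target_job_id }
  return job unless job.nil?

  job = { 'job_id' => target_job_id, 'scraper_name' => scraper_name }
  # Only the database's own current job is marked active on creation.
  job['status'] = 'active' if target_job_id == current_job_id
  jobs << job
  job
end

jobs = []
first = ensure_job(jobs, 1, 'my_scraper')     # creates job 1
again = ensure_job(jobs, 1, 'my_scraper')     # finds job 1, no new record
other = ensure_job(jobs, 1, 'my_scraper', 7)  # creates job 7 without a status

puts jobs.size           # 2
puts first.equal?(again) # true
```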

#generate_job_id ⇒ Integer

Generate a fake job_id.

Returns:

  • (Integer)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 264

def generate_job_id
  jobs.count < 1 ? 1 : (jobs.max{|a,b|a['job_id'] <=> b['job_id']}['job_id'] + 1)
end
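
The generated id is one more than the largest stored `job_id`, or `1` when no jobs exist. A small sketch of that rule over a plain array of hashes (`next_job_id` is a hypothetical name, and the array is an assumption standing in for the jobs collection):

```ruby
# Hypothetical helper: returns the next id after the largest stored one.
def next_job_id(jobs)
  return 1 if jobs.empty?

  jobs.map { |job| job['job_id'] }.max + 1
end

puts next_job_id([])                                     # 1
puts next_job_id([{ 'job_id' => 3 }, { 'job_id' => 10 }]) # 11
```

Ids freed by gaps are never reused: with jobs 3 and 10 stored, the next id is 11, not 1 or 4.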

#generate_output_id(data) ⇒ String

Generate a fake UUID based on output fields without `_` prefix.

Parameters:

  • data (Hash)

    Output data.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 381

def generate_output_id data
  seed = data.select{|k,v|k.to_s =~ /^[^_]/}.hash
  self.class.fake_uuid seed
end
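
Because the seed is built with `Hash#hash`, two outputs whose non-underscore fields are equal receive the same id within one Ruby process. Ruby seeds its hashing per process, so these ids should not be expected to stay the same across separate runs. A stand-alone sketch of the derivation (`output_id` is a hypothetical name):

```ruby
require 'digest/sha1'

# Hypothetical stand-alone version of the id derivation.
def output_id(data)
  # Keep only fields whose key does not start with an underscore.
  public_fields = data.select { |key, _value| key.to_s.match?(/\A[^_]/) }
  Digest::SHA1.hexdigest(public_fields.hash.to_s)
end

a = output_id('name' => 'Widget', '_collection' => 'products')
b = output_id('name' => 'Widget', '_collection' => 'archive')

puts a == b  # true: underscore-prefixed fields do not affect the id
```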

#generate_page_gid(page_data) ⇒ String

Generate a fake UUID based on page data:

* url
* method
* headers
* fetch_type
* cookie
* no_redirect
* body
* ua_type

Parameters:

  • page_data (Hash)

    Page data.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 311

def generate_page_gid page_data
  fields = [
    'url',
    'method',
    'headers',
    'fetch_type',
    'cookie',
    'no_redirect',
    'body',
    'ua_type'
  ]
  data = page_data.select{|k,v|fields.include? k}
  data['url'] = self.class.clean_uri data['url']
  data['headers'] = self.class.format_headers data['headers']
  data['cookie'] = AeEasy::Core::Helper::Cookie.parse_from_request data['cookie'] unless data['cookie'].nil?
  seed = data.select{|k,v|fields.include? k}.hash
  checksum = self.class.fake_uuid seed
  "#{URI.parse(data['url']).hostname}-#{checksum}"
end

#generate_scraper_name ⇒ String

Generate a fake scraper name.

Returns:

  • (String)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 257

def generate_scraper_name
  Faker::Internet.unique.slug
end

#job_id ⇒ Integer?

Fake job id.

Returns:

  • (Integer, nil)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 178

def job_id
  @job_id ||= generate_job_id
end

#job_id=(value) ⇒ Object

Set fake job id value.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 183

def job_id= value
  @job_id = value
  ensure_job
  job_id
end

#jobs ⇒ AeEasy::Core::SmartCollection

Stored job collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 284

def jobs
  return @jobs unless @jobs.nil?
  collection = self.class.new_collection JOB_KEYS,
    defaults: job_defaults
  collection.bind_event(:before_defaults) do |collection, raw_item|
    AeEasy::Core.deep_stringify_keys raw_item
  end
  collection.bind_event(:before_insert) do |collection, item, match|
    item['job_id'] ||= generate_job_id
    item
  end
  @jobs ||= collection
end

#outputs ⇒ AeEasy::Core::SmartCollection

Stored output collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 402

def outputs
  return @outputs unless @outputs.nil?
  collection = self.class.new_collection OUTPUT_KEYS,
    defaults: output_defaults
  collection.bind_event(:before_defaults) do |collection, raw_item|
    item = AeEasy::Core.deep_stringify_keys raw_item
    item.delete '_job_id' unless allow_job_id_override?
    item.delete '_gid_id' unless allow_page_gid_override?
    item
  end
  collection.bind_event(:before_insert) do |collection, item, match|
    item['_id'] ||= generate_output_id item
    item
  end
  collection.bind_event(:after_insert) do |collection, item|
    ensure_job item['job_id']
  end
  @outputs ||= collection
end

#page_gid ⇒ String?

Current fake page gid.

Returns:

  • (String, nil)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 191

def page_gid
  @page_gid ||= self.class.fake_uuid
end

#page_gid=(value) ⇒ Object

Set current fake page gid value.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 196

def page_gid= value
  @page_gid = value
end

#pages ⇒ AeEasy::Core::SmartCollection

Note:

On insert, the page gid is replaced by an auto-generated uuid unless page gid overriding is enabled (see #allow_page_gid_override?).

Stored page collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 357

def pages
  return @pages unless @pages.nil?

  collection = self.class.new_collection PAGE_KEYS,
    defaults: page_defaults
  collection.bind_event(:before_defaults) do |collection, raw_item|
    item = AeEasy::Core.deep_stringify_keys raw_item
    item.delete 'job_id' unless allow_job_id_override?
    item
  end
  collection.bind_event(:before_insert) do |collection, item, match|
    if item['gid'].nil? || !allow_page_gid_override?
      item['gid'] = generate_page_gid item
    end
    item
  end
  @pages ||= collection
end
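
The `:before_insert` hook above keeps a caller-supplied gid only when overriding is enabled; otherwise the gid is always regenerated. A minimal sketch of that rule in isolation, where `apply_gid` and the `generator` lambda are hypothetical stand-ins for the hook and for `#generate_page_gid`:

```ruby
# Hypothetical stand-in for the :before_insert hook; `generator` plays the
# role of #generate_page_gid.
def apply_gid(item, allow_override, generator)
  if item['gid'].nil? || !allow_override
    item['gid'] = generator.call(item)
  end
  item
end

generator = ->(item) { "generated-for-#{item['url']}" }

kept     = apply_gid({ 'url' => 'https://example.com', 'gid' => 'custom' }, true, generator)
replaced = apply_gid({ 'url' => 'https://example.com', 'gid' => 'custom' }, false, generator)
missing  = apply_gid({ 'url' => 'https://example.com' }, true, generator)

puts kept['gid']      # custom
puts replaced['gid']  # generated-for-https://example.com
puts missing['gid']   # generated-for-https://example.com
```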

#query(collection, filter, offset = 0, limit = nil) ⇒ Object

Note:

Warning: filtering is done with a full table scan, so this method should be used in test suites only.

Search items from a collection.

Parameters:

  • collection (Symbol)

    Allowed values: `:outputs`, `:pages`, `:jobs`.

  • filter (Hash)

    Filters to query.

  • offset (Integer) (defaults to: 0)

    (0) Search results offset.

  • limit (Integer|nil) (defaults to: nil)

    (nil) Limit search results count. Set to `nil` for unlimited.

Raises:

  • ArgumentError On unknown collection.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 449

def query collection, filter, offset = 0, limit = nil
  return [] unless limit.nil? || limit > 0

  # Get collection items
  items = case collection
  when :outputs
    outputs
  when :pages
    pages
  when :jobs
    jobs
  else
    raise ArgumentError.new "Unknown collection #{collection}."
  end

  # Search items
  count = 0
  matches = []
  items.each do |item|
    next unless match? item, filter
    count += 1

    # Skip until offset
    next unless offset < count
    # Break on limit reach
    break unless limit.nil? || matches.count < limit
    matches << item
  end
  matches
end
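
The offset and limit handling above counts matches rather than raw items, so the offset skips matching records only. A self-contained sketch of that scan over a plain array follows. `#match?` is not shown in this section, so `simple_match?` is a hypothetical matcher that assumes plain equality on every filter key; the real one may support more.

```ruby
# Hypothetical matcher: every filter pair must equal the item's value.
def simple_match?(item, filter)
  filter.all? { |key, value| item[key] == value }
end

# Same control flow as #query, over any enumerable of hashes.
def scan(items, filter, offset = 0, limit = nil)
  return [] unless limit.nil? || limit > 0

  seen = 0
  matches = []
  items.each do |item|
    next unless simple_match?(item, filter)
    seen += 1
    next unless offset < seen                        # skip the first `offset` matches
    break unless limit.nil? || matches.size < limit  # stop once the limit is reached
    matches << item
  end
  matches
end

pages = [
  { 'gid' => 'a', 'status' => 'done' },
  { 'gid' => 'b', 'status' => 'active' },
  { 'gid' => 'c', 'status' => 'done' },
  { 'gid' => 'd', 'status' => 'done' }
]

result = scan(pages, { 'status' => 'done' }, 1, 2)
puts result.map { |page| page['gid'] }.inspect  # prints ["c", "d"]
```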

#scraper_name ⇒ String?

Fake scraper_name.

Returns:

  • (String, nil)


# File 'lib/ae_easy/core/mock/fake_db.rb', line 165

def scraper_name
  @scraper_name ||= 'my_scraper'
end

#scraper_name=(value) ⇒ Object

Set fake scraper_name value.



# File 'lib/ae_easy/core/mock/fake_db.rb', line 170

def scraper_name= value
  job = ensure_job
  @scraper_name = value
  job['scraper_name'] = scraper_name
end