Skip to content

[WIP] Add base64 serialization #28

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

FabienChaynes
Copy link
Collaborator

Hi,

This PR adds Mongoid::Scroll::Cursor#to_base64 and Mongoid::Scroll::Cursor.from_base64 convenience methods to easily serialize the Mongoid::Scroll::Cursor to an URL-friendly string.

The goal is to use the mongoid-scroll gem in an API to build a cursor based pagination. The API response would look like this:

{
  "records": [
   [...]
  ],
  "paging": {
    "next_page_token": "eyJ2YWx1ZSI6IjU0MTAxNGUxN2FhNThkOGNjZjAwMDAyNTo1NDEwMTRlMTdhYTU4ZDhjY2YwMDAwMjUiLCJmaWVsZF90eXBlIjoiQlNPTjo6T2JqZWN0SWQiLCJmaWVsZF9uYW1lIjoiX2lkIiwiZGlyZWN0aW9uIjoxfQ=="
  }
}

Instead of implementing this mechanism in the application, I thought that it might be useful to integrate it to the gem.

@@ -23,6 +23,10 @@ def criteria
cursor_selector.__evolve_object_id__
end

def to_base64
Base64.strict_encode64({ value: to_s, field_type: field_type, field_name: field_name, direction: direction }.to_json)
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Base64.strict_encode64 to follow RFC 4648 and have an URL-friendly string.

@@ -8,6 +8,16 @@
its(:tiebreak_id) { should be_nil }
its(:value) { should be_nil }
its(:criteria) { should eq({}) }
describe 'base64' do
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure it follows the spec style of this gem. Feel free to correct me, and I'll add other specs after your feedback.

@@ -31,6 +35,13 @@ def from_record(record, options)
cursor.tiebreak_id = record['_id']
cursor
end

def from_base64(str)
Copy link
Collaborator Author

@FabienChaynes FabienChaynes Feb 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll be the caller responsibility to either:

  • encrypt/sign the Base64 string before making it public to be sure that it's not modified
  • perform a check after deserializing the cursor to make sure that Mongoid::Scroll::Cursor params are allowed to sort (field_name, direction, etc...)

The README should probably be updated to mention this (in addition to a description of the new feature) if this PR is worth merging.

@dblock
Copy link
Collaborator

dblock commented Feb 22, 2023

  1. I like the idea of adding multiple ways to serialize a cursor, and documenting that cursor.to_s has the drawbacks that you describe.
  2. Is there another way to solve for the potentially invalid/unwanted sort fields and direction than signing the cursor? Where do we tell users to check? Can the library check that the sort fields match the original sort?
  3. I think we don't want to add a bunch of methods like to_base64 to the cursor class itself. What do you think about subclassing it? We could have Base64EncodedCursor, or SignedBase64EncodedCursor that would use an internal ID or an externally configured key for signing? Then override to_s.

@FabienChaynes
Copy link
Collaborator Author

FabienChaynes commented Feb 22, 2023

Is there another way to solve for the potentially invalid/unwanted sort fields and direction than signing the cursor?

2.a. The gem user could also encrypt it (from their app) before making it public. For example, something like:

crypt = ActiveSupport::MessageEncryptor.new("this secret must be 32 bytes 123")
encrypted = crypt.encrypt_and_sign("eyJ2YWx1ZSI6IjU0MTAxNGUxN2FhNThkOGNjZjAwMDAyNTo1NDEwMTRlMTdhYTU4ZDhjY2YwMDAwMjUiLCJmaWVsZF90eXBlIjoiQlNPTjo6T2JqZWN0SWQiLCJmaWVsZF9uYW1lIjoiX2lkIiwiZGlyZWN0aW9uIjoxfQ==")
# encrypted can now be shared publicly

# Decrypt when you receive the encrypted string from your end user
crypt.decrypt_and_verify(encrypted)

This one uses ActiveSupport::MessageEncryptor, I'm not stating it's a secure way, but the gem user could basically use whatever they want to encrypt the string returned by to_base64.


Where do we tell users to check?

2.b. If the gem user doesn't want to encrypt, they can verify the cursor params (from their app) after deserializing. For example:

cursor = Mongoid::Scroll::Cursor.from_base64("eyJ2YWx1ZSI6IjU0MTAxNGUxN2FhNThkOGNjZjAwMDAyNTo1NDEwMTRlMTdhYTU4ZDhjY2YwMDAwMjUiLCJmaWVsZF90eXBlIjoiQlNPTjo6T2JqZWN0SWQiLCJmaWVsZF9uYW1lIjoiX2lkIiwiZGlyZWN0aW9uIjoxfQ==")
raise "Cursor can't use another field_name than id" unless cursor.field_name == "_id"

This should be stated in the README when describing the Mongoid::Scroll::Cursor.from_base64 method.


Can the library check that the sort fields match the original sort?

2.c. That's an interesting idea. I think it could be implemented in all the *::Scrollable classes. But it would break backward compatibility because currently, nothing prevents it. One way would be to enforce it only if some new option is set when calling the scroll methods. In 27bd29c I implemented a basic version for Mongoid::Criteria::Scrollable, here's what it does:

[5] pry(main)> cursor = nil; Feed::Item.all.limit(2).scroll.map(&:id)
=> [BSON::ObjectId('541014e17aa58d8ccf000025'), BSON::ObjectId('541015010f4ca111df0000b0')]

[6] pry(main)> cursor
=> nil

[7] pry(main)> cursor = nil; Feed::Item.all.limit(2).scroll {|_, c| cursor = c}; cursor
=> #<Mongoid::Scroll::Cursor:0x0000562b394c55a8
 @direction=1,
 @field_name="_id",
 @field_type="BSON::ObjectId",
 @tiebreak_id=BSON::ObjectId('541015010f4ca111df0000b0'),
 @value=BSON::ObjectId('541015010f4ca111df0000b0')>

[8] pry(main)> Feed::Item.all.limit(2).scroll(cursor).map(&:id)
=> [BSON::ObjectId('54101d4d0f4ca1fd04000056'), BSON::ObjectId('54101d960f4ca115b700001e')]

[9] pry(main)> cursor.direction = -1
=> -1

[10] pry(main)> Feed::Item.all.limit(2).scroll(cursor).map(&:id)
Exception: Cursor not following the original sort: [{:field_type=>"BSON::ObjectId", :field_name=>"_id", :direction=>1}, {:field_type=>"BSON::ObjectId", :field_name=>"_id", :direction=>-1}]

What do you think about subclassing it?

3.a. In theory, it makes sense. The issue is that Mongo::Scrollable, Mongoid::Criteria::Scrollable and Moped::Scrollable are building new Mongoid::Scroll::Cursor to yield them. How could we state that we want them to use another class? By adding a new optional argument to the scroll methods? I tried in 73bb694 but I don't really like it. It needs to be passed down the call chain, introduces a bunch of changes and is not really elegant. We could make it a part of options and delete it from the hash, but it doesn't make it really better. Did you have another solution in mind? FTR, here's what it looks like after my change:

[1] pry(main)> cursor = nil; Feed::Item.limit(2).scroll(nil, Mongoid::Scroll::Base64EncodedCursor) {|_, c| cursor = c}; cursor
=> #<Mongoid::Scroll::Base64EncodedCursor:0x000056175ba7f468
 @direction=1,
 @field_name="_id",
 @field_type="BSON::ObjectId",
 @tiebreak_id=BSON::ObjectId('541015010f4ca111df0000b0'),
 @value=BSON::ObjectId('541015010f4ca111df0000b0')>
[2] pry(main)> cursor.to_s
=> "eyJ2YWx1ZSI6IjU0MTAxNTAxMGY0Y2ExMTFkZjAwMDBiMDo1NDEwMTUwMTBmNGNhMTExZGYwMDAwYjAiLCJmaWVsZF90eXBlIjoiQlNPTjo6T2JqZWN0SWQiLCJmaWVsZF9uYW1lIjoiX2lkIiwiZGlyZWN0aW9uIjoxfQ=="

or SignedBase64EncodedCursor that would use an internal ID or an externally configured key for signing?

3.b. IDK if it's really the role of the gem to chose signing algorithms. We could pass a block which would be used to sign the hash, but it's maybe a bit convoluted (I tried it in b22a195):

> cursor = nil; Feed::Item.all.limit(2).scroll(nil, Mongoid::Scroll::SignedBase64EncodedCursor) {|_, c| cursor = c}; cursor
> p = Proc.new { |s| OpenSSL::HMAC.hexdigest("SHA256", key, s) }
> e = cursor.to_s(&p)
=> "eyJ2YWx1ZSI6IjU0MTAxNTAxMGY0Y2ExMTFkZjAwMDBiMDo1NDEwMTUwMTBmNGNhMTExZGYwMDAwYjAiLCJmaWVsZF90eXBlIjoiQlNPTjo6T2JqZWN0SWQiLCJmaWVsZF9uYW1lIjoiX2lkIiwiZGlyZWN0aW9uIjoxLCJzaWduIjoiYzNhNzU0MDg3MGU1OTVkYjE3YjU4ZGVhZTAzZWRkMTQyY2U1MzY4ZGY4NjI0NTg5NzM1NmMwOWRkNTU3ZWY4OSJ9"

> Mongoid::Scroll::SignedBase64EncodedCursor.deserialize(e, &p)
=> #<Mongoid::Scroll::SignedBase64EncodedCursor:0x000055a292ccc8d0
 @direction=1,
 @field_name="_id",
 @field_type="BSON::ObjectId",
 @tiebreak_id=BSON::ObjectId('541015010f4ca111df0000b0'),
 @value="541015010f4ca111df0000b0">

> Mongoid::Scroll::SignedBase64EncodedCursor.deserialize(e) { |_| "invalid" }
Exception: Invalid signature

FTR, the commits I referenced are not meant to be deeply reviewed. They are quick and dirty tests and mainly for demonstration purpose. If we don't want to go this way, we can just revert them.

@dblock
Copy link
Collaborator

dblock commented Feb 23, 2023

First, thank you for coming here and making all these proposals! Appreciate your patience and detailed suggestions, please don't take my pushing back as too nitpicky, do lmk when it becomes that way. I'm thinking out loud with you, and I'd love to work through each one of your suggestions and try to get the best solutions in. These are interesting problems to me!

  • For 2a and 2b I think you should/can PR a documentation change with examples using the current implementation (cursor constructor instead of from_base64). That would immediately call out the pitfalls and that's very useful.
  • For 2c. Do you see any valid reason why someone would want this feature (to pass a cursor with different direction or sort criteria)? I can't come up with a valid scenario, and is likely an error, so I think this is a good separate change as a new feature, minor version increment. I would implement this similarly to raise_multiple_sort_fields_error if multiple_sort_fields?, separate PR. We can introduce options, too, but let's have a good reason for it.
  • For 3a. I think the solution is to create scrollable modules, e.g. Mongoid::Scrollable::Base64Encoded that extends/includes/inherits Mongoid::Scrollable and supplies the cursor class.
  • For 3b I think we need to provide a few different often used Base64EncodedCursor, SignedCursor, along with a way to use them, and all these cursors need to implement parse and to_s differently. Let's start with base64 encoding, once we have that nailed down we can talk about encrypting/signing. You will note that I am against adding from_base64 and to_base64, preferring an OO solution of overriding functionality.

I think you should pick the simplest of the things above and make separate PRs! Let's get some code merged.

@FabienChaynes
Copy link
Collaborator Author

FabienChaynes commented Feb 23, 2023

No worries, I'm all for constructive criticism 🙂

2a. and 2b. should become useless if 2c. is implemented. Since I started a PR for 2c., I won't tackle them.
About 2c. you're right, there's probably no legit usecase. I tend to be overly conservative with breaking changes because every change breaks someone's workflow 😁 But I guess that this one can be ok.

It'll indeed be better to split it into several PRs, time for a TODO list!

Siteproxy
@dblock
Copy link
Collaborator

dblock commented Feb 23, 2023

I tend to be overly conservative with breaking changes because every change breaks someone's workflow 😁 But I guess that this one can be ok.

We have semver for that, we can just increment the major version or pretend like it's a fix ;) If you feel conservative, do the former.

@dblock
Copy link
Collaborator

dblock commented Feb 23, 2023

I do like the idea that users may want to serialize cursors differently, so I wouldn't dismiss the need to subclass Cursor. If you aren't going to code it, open an issue referencing the various conversations we had here?

@FabienChaynes
Copy link
Collaborator Author

Closing this PR since it was split into other PRs/issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants