Break and stop the pagination if we have the results #152

Merged (3 commits) on Feb 23, 2019

Fokko (Contributor) commented Feb 21, 2019

Hi Googlers,

I would like to suggest the following optimization. Right now the while loop keeps fetching results even after we already have the maximum number of results. This becomes very inefficient when, for example, you are checking whether a directory exists, because it fetches all the pages.

This caused our cluster to make tens of thousands of requests per second, which became quite expensive.
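
To illustrate the idea, here is a minimal sketch of breaking out of a pagination loop once the requested number of results has been collected. It is not the connector's actual code: `PaginationSketch`, `Page`, `FakeBucket`, `fetchPage`, and `listNames` are hypothetical names, and the in-memory `FakeBucket` only stands in for a remote list API so that the number of requests can be counted.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these names come from the GCS connector. */
final class PaginationSketch {

  /** One page of results plus the token for the next page (null on the last page). */
  static final class Page {
    final List<String> items;
    final String nextPageToken;

    Page(List<String> items, String nextPageToken) {
      this.items = items;
      this.nextPageToken = nextPageToken;
    }
  }

  /** In-memory stand-in for a remote list API; counts how many page requests were made. */
  static final class FakeBucket {
    private final List<String> objects;
    private final int pageSize;
    int requestCount = 0;

    FakeBucket(List<String> objects, int pageSize) {
      this.objects = objects;
      this.pageSize = pageSize;
    }

    Page fetchPage(String pageToken) {
      requestCount++;
      // The page token is simply the start offset encoded as a string.
      int start = pageToken == null ? 0 : Integer.parseInt(pageToken);
      int end = Math.min(start + pageSize, objects.size());
      String next = end < objects.size() ? Integer.toString(end) : null;
      return new Page(new ArrayList<>(objects.subList(start, end)), next);
    }
  }

  /**
   * Lists object names, stopping as soon as maxResults names have been collected.
   * A maxResults value of zero or less means "no limit" (an assumption of this sketch).
   */
  static List<String> listNames(FakeBucket bucket, long maxResults) {
    List<String> results = new ArrayList<>();
    String pageToken = null;
    do {
      Page page = bucket.fetchPage(pageToken);
      for (String name : page.items) {
        results.add(name);
        if (maxResults > 0 && results.size() >= maxResults) {
          // We have everything the caller asked for: do not request more pages.
          return results;
        }
      }
      pageToken = page.nextPageToken;
    } while (pageToken != null);
    return results;
  }

  public static void main(String[] args) {
    List<String> objects = new ArrayList<>();
    for (int i = 0; i < 1000; i++) {
      objects.add("dir/file-" + i);
    }
    FakeBucket bucket = new FakeBucket(objects, 100);
    // An existence check only needs one result, so only one page is requested.
    List<String> first = listNames(bucket, 1);
    System.out.println(first.size() + " result(s) after " + bucket.requestCount + " request(s)");
  }
}
```

With 1000 objects and a page size of 100, the existence check in `main` issues a single request instead of ten. Without the early return, every call would walk all pages regardless of how few results the caller needs.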

Regards, Fokko

@googlebot

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.



@Fokko Fokko force-pushed the break-early branch 2 times, most recently from 81a40ab to 591e495 Compare February 21, 2019 17:00

Fokko (Contributor, Author) commented Feb 21, 2019

I signed it!

medb (Contributor) commented Feb 21, 2019

Thank you for your contribution.

I think that we mitigated the spike in list requests in the newly released GCS connector 1.9.15 (see issue #151), but this will be a worthwhile improvement anyway. Please allow us some time to review it.

Fokko (Contributor, Author) commented Feb 22, 2019

I signed it!

@googlebot

CLAs look good, thanks!


Fokko (Contributor, Author) commented Feb 22, 2019

Thanks for the quick response. I'll try to fix the tests sometime today.

Fokko and others added 3 commits February 23, 2019 09:21
Right now the while loop continues to fetch results even when we already
have the object. This can be very expensive when you're checking whether
a directory exists, because it fetches all the pages.
@medb medb merged commit c26b8d7 into GoogleCloudDataproc:master Feb 23, 2019
@Fokko Fokko deleted the break-early branch February 23, 2019 19:12

Fokko (Contributor, Author) commented Feb 23, 2019

Thanks @medb

mayanks pushed a commit to mayanks/hadoop-connectors that referenced this pull request Aug 3, 2022