Google Drive
This notebook covers how to load documents from Google Drive. Currently, only Google Docs are supported.
Prerequisites
- Create a Google Cloud project or use an existing project
- Enable the Google Drive API
- Authorize credentials for desktop app
- `pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib`
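As a quick sanity check after the install step, the sketch below reports which of the three packages cannot be imported yet. The helper name `missing_packages` and the mapping `REQUIRED_MODULES` are illustrative names, not part of any library; the import names on the left are the modules those pip packages install.

```python
import importlib.util

# Import name -> pip package name for the three prerequisites listed above.
REQUIRED_MODULES = {
    "googleapiclient": "google-api-python-client",
    "google_auth_httplib2": "google-auth-httplib2",
    "google_auth_oauthlib": "google-auth-oauthlib",
}


def missing_packages():
    """Return the pip names of prerequisite packages that are not importable."""
    return [
        pip_name
        for module_name, pip_name in REQUIRED_MODULES.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install --upgrade " + " ".join(missing))
    else:
        print("All Google client prerequisites are installed.")
```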
🧑 Instructions for ingesting your Google Docs data
By default, the `GoogleDriveLoader` expects the `credentials.json` file to be at `~/.credentials/credentials.json`, but this is configurable using the `credentials_path` keyword argument. The same applies to `token.json`, whose location is set with the `token_path` keyword argument. Note that `token.json` will be created automatically the first time you use the loader.
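The two keyword arguments can be set together when the credential files live somewhere other than `~/.credentials`. The following is a minimal sketch, assuming both files sit in the same directory; `auth_kwargs` and `build_loader` are hypothetical helper names introduced here for illustration, while `credentials_path` and `token_path` are the loader arguments described above.

```python
from pathlib import Path


def auth_kwargs(base_dir=None):
    """Build the credentials_path / token_path arguments for GoogleDriveLoader.

    Falls back to the default location (~/.credentials) when base_dir is None.
    """
    base = Path(base_dir) if base_dir is not None else Path.home() / ".credentials"
    return {
        "credentials_path": base / "credentials.json",
        "token_path": base / "token.json",
    }


def build_loader(folder_id, base_dir=None):
    """Create a loader for one folder, reading credentials from base_dir."""
    # Imported lazily so the path helper above works without langchain installed.
    from langchain.document_loaders import GoogleDriveLoader

    return GoogleDriveLoader(folder_id=folder_id, **auth_kwargs(base_dir))
```

Calling `build_loader("<your folder id>", base_dir="/path/to/secrets")` would then look for `/path/to/secrets/credentials.json` and write `token.json` next to it after the first authorization.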
`GoogleDriveLoader` can load from a list of Google Docs document ids or a folder id. You can obtain your folder and document id from the URL:
- Folder: https://drive.google.com/drive/u/0/folders/1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5 -> folder id is `"1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5"`
- Document: https://docs.google.com/document/d/1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw/edit -> document id is `"1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw"`
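If you have many URLs, the id can be pulled out programmatically instead of by hand. This is a rough sketch that assumes URLs shaped like the two examples above (`/folders/<id>` and `/document/d/<id>`); the function name `extract_drive_id` is hypothetical, and other Google Drive URL shapes are not handled.

```python
import re
from urllib.parse import urlparse

# Patterns matching the two URL shapes shown above.
_FOLDER_RE = re.compile(r"/folders/([A-Za-z0-9_-]+)")
_DOCUMENT_RE = re.compile(r"/document/d/([A-Za-z0-9_-]+)")


def extract_drive_id(url):
    """Return ("folder", id) or ("document", id) for a Drive or Docs URL."""
    path = urlparse(url).path
    for kind, pattern in (("folder", _FOLDER_RE), ("document", _DOCUMENT_RE)):
        match = pattern.search(path)
        if match:
            return kind, match.group(1)
    raise ValueError(f"No folder or document id found in URL: {url}")
```

The returned id is the string you would pass as `folder_id`, or as an element of the document id list.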
from langchain.document_loaders import GoogleDriveLoader

loader = GoogleDriveLoader(
    folder_id="1yucgL9WGgWZdM1TOuKkeghlPizuzMYb5",
    # Optional: configure whether to recursively fetch files from subfolders. Defaults to False.
    recursive=False,
)

docs = loader.load()
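The loader can also be pointed at a list of document ids rather than a folder. The sketch below shows that variant using the `document_ids` argument, plus a small helper for eyeballing what came back. `load_documents_by_id` and `preview` are hypothetical names introduced here, and reading the `"source"` metadata key is an assumption about what the loader records, so the helper falls back to a placeholder when the key is absent.

```python
def load_documents_by_id(document_ids):
    """Load specific Google Docs by id instead of a whole folder."""
    # Imported lazily so preview() below can be used without langchain installed.
    from langchain.document_loaders import GoogleDriveLoader

    loader = GoogleDriveLoader(document_ids=list(document_ids))
    return loader.load()


def preview(docs, width=80):
    """Return one line per document: its source and the start of its text."""
    lines = []
    for doc in docs:
        # "source" is assumed to be present in metadata; fall back if it is not.
        source = doc.metadata.get("source", "<unknown source>")
        # Collapse whitespace so the snippet stays on one line.
        snippet = " ".join(doc.page_content.split())[:width]
        lines.append(f"{source}: {snippet}")
    return lines
```

For example, `preview(load_documents_by_id(["1bfaMQ18_i56204VaQDVeAFpqEijJTgvurupdEDiaUQw"]))` would give a one-line summary of that document, provided the credentials are set up and the account can read it.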