GDAL Virtual File Systems (compressed, network hosted, etc...): /vsimem, /vsizip, /vsitar, /vsicurl, ... — GDAL documentation
/vsicurl/ (http/https/ftp files: random access)
/vsicurl/ is a file system handler that allows on-the-fly random reading of files available through HTTP/FTP web protocols, without prior download of the entire file. It requires GDAL to be built against libcurl.
Recognized filenames are of the form /vsicurl/http[s]://path/to/remote/resource or /vsicurl/ftp://path/to/remote/resource, where path/to/remote/resource is the URL of a remote resource.
Example using ogrinfo to read a shapefile on the internet:
ogrinfo -ro -al -so /vsicurl/https://raw.githubusercontent.com/OSGeo/gdal/master/autotest/ogr/data/poly.shp
Options can be passed in the filename with the following syntax: /vsicurl?[option_i=val_i&]*url=http://... where each option name and value (including the value of "url") is URL-encoded.
Currently supported options are:
use_head=yes/no: whether the HTTP HEAD request can be emitted. Defaults to YES. Setting this option overrides the behavior of the CPL_VSIL_CURL_USE_HEAD configuration option.
max_retry=number: defaults to 0. Setting this option overrides the behavior of the GDAL_HTTP_MAX_RETRY configuration option.
retry_delay=number_in_seconds: defaults to 30. Setting this option overrides the behavior of the GDAL_HTTP_RETRY_DELAY configuration option.
retry_codes=ALL or comma-separated list of HTTP error codes. Setting this option overrides the behavior of the GDAL_HTTP_RETRY_CODES configuration option. (GDAL >= 3.10)
list_dir=yes/no: whether an attempt should be made to read the file list of the directory where the file is located. Defaults to YES.
empty_dir=yes/no: whether to disable directory listing and disable logic in drivers to probe for individual side-car files. Defaults to NO.
useragent=value: HTTP UserAgent header
referer=value: HTTP Referer header
cookie=value: HTTP Cookie header
header_file=value: Filename that contains one or several "Header: Value" lines
header.<key>=<value>: HTTP request header of name <key> and value <value>. (GDAL >= 3.11). e.g. header.Accept=application%2Fjson
unsafessl=yes/no
low_speed_time=value
low_speed_limit=value
proxy=value
proxyauth=value
proxyuserpwd=value
pc_url_signing=yes/no: whether to use the URL signing mechanism of Microsoft Planetary Computer (https://planetarycomputer.microsoft.com/docs/concepts/sas/). (GDAL >= 3.5.2). Note that starting with GDAL 3.9, this may also be enabled by setting the path-specific option VSICURL_PC_URL_SIGNING to YES (cf. VSISetPathSpecificOption()).
pc_collection=name: name of the collection of the dataset for Planetary Computer URL signing. Only used when pc_url_signing=yes. (GDAL >= 3.5.2)
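The option syntax above can be sketched in a few lines of Python. This is only an illustration of the /vsicurl?[option_i=val_i&]*url=... form, using the standard library for URL-encoding; the helper name vsicurl_filename and the example.com URL are hypothetical and not part of GDAL.

```python
from urllib.parse import quote


def vsicurl_filename(url, **options):
    """Build a /vsicurl?option=value&...&url=... filename.

    Hypothetical helper: every option name and value, including the
    value of "url", is URL-encoded as the syntax requires.
    """
    parts = [
        f"{quote(str(name), safe='')}={quote(str(value), safe='')}"
        for name, value in options.items()
    ]
    # The url parameter is placed last, matching the documented syntax.
    parts.append("url=" + quote(url, safe=""))
    return "/vsicurl?" + "&".join(parts)


# Placeholder URL; any http/https/ftp resource would be encoded the same way.
name = vsicurl_filename(
    "https://example.com/data/foo.tif",
    max_retry=3,
    retry_delay=5,
    list_dir="no",
)
print(name)
```

The resulting string can then be passed wherever a GDAL filename is expected, for example as the dataset argument of ogrinfo or gdalinfo.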
Partial downloads (requires the HTTP server to support random reading) are done with a 16 KB granularity by default. The chunk size can be configured with the CPL_VSIL_CURL_CHUNK_SIZE configuration option, with a value in bytes. If the driver detects sequential reading, it will progressively increase the chunk size up to 128 times CPL_VSIL_CURL_CHUNK_SIZE (so 2 MB by default) to improve download performance.
In addition, a global least-recently-used cache of 16 MB shared among all downloaded content is used, and content in it may be reused after a file handle has been closed and reopened, during the lifetime of the process or until VSICurlClearCache() is called. The size of this global LRU cache can be modified by setting the configuration option CPL_VSIL_CURL_CACHE_SIZE (in bytes).
When increasing the value of CPL_VSIL_CURL_CHUNK_SIZE to optimize sequential reading, it is recommended to increase CPL_VSIL_CURL_CACHE_SIZE as well to 128 times the value of CPL_VSIL_CURL_CHUNK_SIZE.
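The sizing rule can be worked through with a little arithmetic. The sketch below assumes the defaults stated above (16 KB chunks, 16 MB cache, growth up to 128 chunks) and derives matching values for the two configuration options. The function name is hypothetical, and never going below the default cache size is an assumption of this sketch rather than a documented rule.

```python
DEFAULT_CHUNK_SIZE = 16 * 1024          # 16 KB default granularity
DEFAULT_CACHE_SIZE = 16 * 1024 * 1024   # 16 MB default global LRU cache
MAX_CHUNK_MULTIPLIER = 128              # sequential reads grow up to 128 chunks


def sizing_for_chunk(chunk_size=DEFAULT_CHUNK_SIZE):
    """Return configuration option values (as strings, in bytes) for a chunk size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive number of bytes")
    largest_request = chunk_size * MAX_CHUNK_MULTIPLIER
    # Assumption of this sketch: keep at least the default cache size.
    cache_size = max(DEFAULT_CACHE_SIZE, largest_request)
    return {
        "CPL_VSIL_CURL_CHUNK_SIZE": str(chunk_size),
        "CPL_VSIL_CURL_CACHE_SIZE": str(cache_size),
    }


print(sizing_for_chunk())             # default chunk: largest request is 2 MB
print(sizing_for_chunk(1024 * 1024))  # 1 MB chunks: cache sized to 128 MB
```

The returned values are strings because GDAL configuration options are set as text, for example through environment variables.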
The GDAL_INGESTED_BYTES_AT_OPEN configuration option can be set to impose the number of bytes read in one GET call at file opening. This can improve performance when reading a Cloud Optimized GeoTIFF with a large header.
The GDAL_HTTP_PROXY (for both HTTP and HTTPS protocols), GDAL_HTTPS_PROXY (for HTTPS protocol only), GDAL_HTTP_PROXYUSERPWD and GDAL_PROXY_AUTH configuration options can be used to define a proxy server. The syntax to use is the one of Curl CURLOPT_PROXY, CURLOPT_PROXYUSERPWD and CURLOPT_PROXYAUTH options.
The CURL_CA_BUNDLE or SSL_CERT_FILE configuration options can be used to set the path to the Certification Authority (CA) bundle file (if not specified, curl will use a file in a system location).
Additional HTTP headers can be sent by setting the GDAL_HTTP_HEADER_FILE configuration option to point to a filename of a text file with "key: value" HTTP headers.
As an alternative, starting with GDAL 3.6, the GDAL_HTTP_HEADERS configuration option can also be used to specify headers. CPL_CURL_VERBOSE=YES allows one to see them and more, when combined with --debug.
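As a minimal sketch of the header-file approach, the following Python writes a text file of "key: value" lines and prepares an environment in which GDAL_HTTP_HEADER_FILE points at it. It relies on GDAL reading configuration options from environment variables; the header names, the values, and the helper name are placeholders chosen for illustration.

```python
import os
import tempfile


def write_header_file(headers, path):
    """Write one "Header: Value" line per entry (hypothetical helper)."""
    with open(path, "w", encoding="utf-8", newline="\n") as f:
        for name, value in headers.items():
            f.write(f"{name}: {value}\n")
    return path


# Placeholder headers; replace with whatever the remote server expects.
header_path = os.path.join(tempfile.mkdtemp(), "gdal_headers.txt")
write_header_file(
    {"Accept": "application/json", "X-Example-Token": "placeholder"},
    header_path,
)

# Environment for a child process such as ogrinfo or gdalinfo.
env = dict(os.environ, GDAL_HTTP_HEADER_FILE=header_path)
print(env["GDAL_HTTP_HEADER_FILE"])
```

The env mapping could be passed to subprocess.run(..., env=env) when launching a GDAL command-line utility.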
Starting with GDAL 3.10, the Authorization header is no longer automatically forwarded when redirections are followed. That behavior can be configured by setting the CPL_VSIL_CURL_AUTHORIZATION_HEADER_ALLOWED_IF_REDIRECT configuration option.
Starting with GDAL 3.11, a query string can be appended to a given /vsicurl/ filename by taking its value from the VSICURL_QUERY_STRING path-specific option set with VSISetPathSpecificOption(). This can for example be used when managing Shared Access Signatures (SAS) on application side, and not wanting to include the signature as part of the filename propagated through GDAL.
The GDAL_HTTP_MAX_RETRY (number of attempts) and GDAL_HTTP_RETRY_DELAY (in seconds) configuration options can be set, so that request retries are done in case of HTTP errors 429, 502, 503 or 504.
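The retry rule described here can be illustrated with a small decision function. This is a sketch of the documented semantics only, not GDAL's implementation. The function name and the attempts_done parameter are hypothetical; the default of 0 retries comes from the max_retry option described earlier.

```python
# HTTP status codes that trigger a retry according to the text.
DEFAULT_RETRY_CODES = frozenset({429, 502, 503, 504})


def should_retry(status_code, attempts_done, max_retry=0,
                 retry_codes=DEFAULT_RETRY_CODES):
    """Return True if another attempt is allowed for this HTTP status.

    attempts_done counts the retries already performed; max_retry plays
    the role of GDAL_HTTP_MAX_RETRY (0 means no retry at all).
    """
    if attempts_done >= max_retry:
        return False
    return status_code in retry_codes


print(should_retry(503, attempts_done=0, max_retry=3))
print(should_retry(404, attempts_done=0, max_retry=3))
print(should_retry(503, attempts_done=3, max_retry=3))
```

A caller would wait for the configured delay (GDAL_HTTP_RETRY_DELAY, in seconds) between two attempts.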
Starting with GDAL 3.6, the following configuration options control the TCP keep-alive functionality (cf https://daniel.haxx.se/blog/2020/02/10/curl-ootw-keepalive-time/ for a detailed explanation):
GDAL_HTTP_TCP_KEEPALIVE = YES/NO. Whether to enable TCP keep-alive. Defaults to NO.
GDAL_HTTP_TCP_KEEPIDLE = integer, in seconds. Keep-alive idle time. Defaults to 60. Only taken into account if GDAL_HTTP_TCP_KEEPALIVE=YES.
GDAL_HTTP_TCP_KEEPINTVL = integer, in seconds. Interval time between keep-alive probes. Defaults to 60. Only taken into account if GDAL_HTTP_TCP_KEEPALIVE=YES.
Starting with GDAL 3.7, the following configuration options control support for SSL client certificates:
GDAL_HTTP_SSLCERT = filename. Filename of the SSL client certificate. Cf https://curl.se/libcurl/c/CURLOPT_SSLCERT.html
GDAL_HTTP_SSLCERTTYPE = string. Format of the SSL certificate: "PEM" or "DER". Cf https://curl.se/libcurl/c/CURLOPT_SSLCERTTYPE.html
GDAL_HTTP_SSLKEY = filename. Private key file for TLS and SSL client certificate. Cf https://curl.se/libcurl/c/CURLOPT_SSLKEY.html
GDAL_HTTP_KEYPASSWD = string. Passphrase to private key. Cf https://curl.se/libcurl/c/CURLOPT_KEYPASSWD.html
More generally, the options of CPLHTTPFetch() that are exposed as configuration options are available. Starting with GDAL 3.7, the above configuration options can also be specified as path-specific options with VSISetPathSpecificOption().
Starting with GDAL 3.11, the following configuration options control the number of HTTP connections:
GDAL_HTTP_MAX_CACHED_CONNECTIONS = integer_number. Maximum number of connections that libcurl may keep alive in its connection cache after use. Cf https://curl.se/libcurl/c/CURLMOPT_MAXCONNECTS.html
GDAL_HTTP_MAX_TOTAL_CONNECTIONS = integer_number. Maximum number of simultaneously open connections in total. Cf https://curl.se/libcurl/c/CURLMOPT_MAX_TOTAL_CONNECTIONS.html
The file can be cached in RAM by setting the configuration option VSI_CACHE to TRUE. The cache size defaults to 25 MB, but can be modified by setting the configuration option VSI_CACHE_SIZE (in bytes). Content in that cache is discarded when the file handle is closed.
The CPL_VSIL_CURL_NON_CACHED configuration option can be set to values like /vsicurl/http://example.com/foo.tif:/vsicurl/http://example.com/some_directory, so that at file handle closing, all cached content related to the mentioned file(s) is no longer cached. This can help when dealing with resources that can be modified during execution of GDAL related code. Alternatively, VSICurlClearCache() can be used.
/vsicurl/ will try to query directly redirected URLs to Amazon S3 signed URLs during their validity period, so as to minimize round-trips. This behavior can be disabled by setting the configuration option CPL_VSIL_CURL_USE_S3_REDIRECT to NO.
Starting with GDAL 3.12, the GDAL_HTTP_PATH_VERBATIM configuration option can be set to YES so that sequences of /../ or /./ that may exist in the URL's path part are kept unchanged. Otherwise, by default, they are squashed, according to RFC 3986 section 5.2.4.
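To make the default squashing behavior concrete, here is a sketch of the "remove dot segments" procedure from RFC 3986 section 5.2.4, written directly from the RFC's step list. It illustrates the normalization only and is not the code GDAL or libcurl actually runs; the function name is chosen for this example.

```python
def remove_dot_segments(path):
    """Squash "." and ".." segments of a URL path per RFC 3986, section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]              # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]              # "/../" becomes "/"
            if output:
                output.pop()             # drop the previous segment
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to the output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                output.append(path)
                path = ""
            else:
                output.append(path[:end])
                path = path[end:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))
print(remove_dot_segments("/data/./tiles/../foo.tif"))
```

With GDAL_HTTP_PATH_VERBATIM set to YES, a path such as /data/./tiles/../foo.tif would instead be sent unchanged.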
VSIStatL() will return the size in the st_size member and the file nature (file or directory) in the st_mode member (the latter is only reliable with FTP resources for now).
VSIReadDir() should be able to parse the HTML directory listing returned by the most popular web servers, such as Apache and Microsoft IIS.
/vsicurl_streaming/ (http/https/ftp files: streaming)
/vsicurl_streaming/ is a file system handler that allows on-the-fly sequential reading of files streamed through HTTP/FTP web protocols, without prior download of the entire file. It requires GDAL to be built against libcurl.
Although this file handler is able to seek to random offsets in the file, this will not be efficient. If you need efficient random access and the server supports range downloading, you should use the /vsicurl/ file system handler instead.
Recognized filenames are of the form /vsicurl_streaming/http[s]://path/to/remote/resource or /vsicurl_streaming/ftp://path/to/remote/resource, where path/to/remote/resource is the URL of a remote resource.
The GDAL_HTTP_PROXY (for both HTTP and HTTPS protocols), GDAL_HTTPS_PROXY (for HTTPS protocol only), GDAL_HTTP_PROXYUSERPWD and GDAL_PROXY_AUTH configuration options can be used to define a proxy server. The syntax to use is the one of Curl CURLOPT_PROXY, CURLOPT_PROXYUSERPWD and CURLOPT_PROXYAUTH options.
The CURL_CA_BUNDLE or SSL_CERT_FILE configuration options can be used to set the path to the Certification Authority (CA) bundle file (if not specified, curl will use a file in a system location).
The file can be cached in RAM by setting the configuration option VSI_CACHE to TRUE. The cache size defaults to 25 MB, but can be modified by setting the configuration option VSI_CACHE_SIZE (in bytes).
VSIStatL() will return the size in the st_size member and the file nature (file or directory) in the st_mode member (the latter is only reliable with FTP resources for now).
/vsiaz/ (Microsoft Azure Blob files)
/vsiaz/ is a file system handler that allows on-the-fly random reading of (primarily non-public) files available in Microsoft Azure Blob containers, without prior download of the entire file. It requires GDAL to be built against libcurl.
See /vsiadls/ for a related filesystem for Azure Data Lake Storage Gen2.
It also allows sequential writing of files. No seeks or read operations are then allowed, so in particular direct writing of GeoTIFF files with the GTiff driver is not supported. Starting with GDAL 3.2, setting the CPL_VSIL_USE_TEMP_FILE_FOR_RANDOM_WRITE configuration option to YES makes random-write access possible; this involves the creation of a temporary local file, whose location is controlled by the CPL_TMPDIR configuration option. A block blob will be created if the file size is below 4 MB. Above that size, an append blob will be created (with a maximum file size of 195 GB).
Deletion of files with VSIUnlink(), creation of directories with VSIMkdir() and deletion of (empty) directories with VSIRmdir() are also possible. Note: when using VSIMkdir(), a special hidden .gdal_marker_for_dir empty file is created, since Azure Blob does not natively support empty directories. If that file is the last one remaining in a directory, VSIRmdir() will automatically remove it. This file will not be seen with VSIReadDir(). If removing files from directories not created with VSIMkdir(), when the last file is deleted, its directory is automatically removed by Azure, so the sequence VSIUnlink("/vsiaz/container/subdir/lastfile") followed by VSIRmdir("/vsiaz/container/subdir") will fail on the VSIRmdir() invocation.
Recognized filenames are of the form /vsiaz/container/key, where container is the name of the container and key is the object "key", i.e. a filename potentially containing subdirectories.
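The /vsiaz/container/key layout can be illustrated by splitting a filename into its two parts. The helper below is a hypothetical convenience written for this example, not a GDAL function, and the container and key names are placeholders.

```python
VSIAZ_PREFIX = "/vsiaz/"


def split_vsiaz_path(filename):
    """Split a /vsiaz/ filename into (container, key).

    The key may contain "/" separators (subdirectories) and is empty
    when the filename designates the container itself.
    """
    if not filename.startswith(VSIAZ_PREFIX):
        raise ValueError(f"not a /vsiaz/ filename: {filename!r}")
    container, _, key = filename[len(VSIAZ_PREFIX):].partition("/")
    if not container:
        raise ValueError("missing container name")
    return container, key


print(split_vsiaz_path("/vsiaz/mycontainer/subdir/image.tif"))
```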
The generalities of /vsicurl/ apply.
The following configuration options are specific to the /vsiaz/ handler:
AZURE_NO_SIGN_REQUEST=[YES/NO]: (GDAL >= 3.2) Controls whether requests are signed.
AZURE_STORAGE_CONNECTION_STRING=value: Credential string provided in the Access Key section of the administrative interface, containing both the account name and a secret key.
AZURE_STORAGE_ACCESS_TOKEN=value: (GDAL >= 3.5) Access token typically obtained using Microsoft Authentication Library (MSAL).
AZURE_STORAGE_ACCOUNT=value: Specifies storage account name.
AZURE_STORAGE_ACCESS_KEY=value: Specifies secret key associated with AZURE_STORAGE_ACCOUNT.
AZURE_STORAGE_SAS_TOKEN=value: (GDAL >= 3.2) Shared Access Signature.
AZURE_IMDS_OBJECT_ID=value: (GDAL >= 3.8) object_id of the managed identity you would like the token for, when using Azure Instance Metadata Service (IMDS) authentication in an Azure Virtual Machine. Required if your VM has multiple user-assigned managed identities. This option may be set as a path-specific option with VSISetPathSpecificOption().
AZURE_IMDS_CLIENT_ID=value: (GDAL >= 3.8) client_id of the managed identity you would like the token for, when using Azure Instance Metadata Service (IMDS) authentication in an Azure Virtual Machine. Required if your VM has multiple user-assigned managed identities. This option may be set as a path-specific option with VSISetPathSpecificOption().
AZURE_IMDS_MSI_RES_ID=value: (GDAL >= 3.8) msi_res_id (Azure Resource ID) of the managed identity you would like the token for, when using Azure Instance Metadata Service (IMDS) authentication in an Azure Virtual Machine. Required if your VM has multiple user-assigned managed identities. This option may be set as a path-specific option with VSISetPathSpecificOption().
Several authentication methods are possible, and are attempted in the following order:
The AZURE_STORAGE_CONNECTION_STRING configuration option
The AZURE_STORAGE_ACCOUNT configuration option is set to specify the account name AND
(GDAL >= 3.5) The AZURE_STORAGE_ACCESS_TOKEN configuration option is set to specify the access token, which will be included in an "Authorization: Bearer ${AZURE_STORAGE_ACCESS_TOKEN}" header. This access token is typically obtained using Microsoft Authentication Library (MSAL).
The AZURE_STORAGE_ACCESS_KEY configuration option is set to specify the secret key.
The AZURE_NO_SIGN_REQUEST=YES configuration option is set, so as to disable any request signing. This option might be used for accounts with public access rights. Available since GDAL 3.2
The AZURE_STORAGE_SAS_TOKEN configuration option (AZURE_SAS if GDAL < 3.5) is set to specify a Shared Access Signature. This SAS is appended to URLs built by the /vsiaz/ file system handler. Its value should already be URL-encoded and should not contain any leading '?' or '&' character (e.g. a valid one may look like "st=2019-07-18T03%3A53%3A22Z&se=2035-07-19T03%3A53%3A00Z&sp=rl&sv=2018-03-28&sr=c&sig=2RIXmLbLbiagYnUd49rgx2kOXKyILrJOgafmkODhRAQ%3D"). Available since GDAL 3.2
The current machine is a Azure Virtual Machine with Azure Ac