📅 2020-May-04 ⬩ ✍️ Ashwin Nanjappa ⬩ 🏷️ curl, wget ⬩ 📚 Archive
Sometimes colleagues share a URL with me that points to a file I need to download. Since these links can point to huge files, I like to know the file size beforehand.
The file size is reported in the headers of the web server's response to the GET request that downloads a file. But how can you get this information without downloading the file? It turns out you can send an HTTP HEAD request instead, which returns just the headers without the file contents.
The --head option of curl sends a HEAD request and prints just the response headers, which should include the file size in bytes in the Content-Length field.
Here is an example:
$ curl --head http://foobar.com/haha.zip
HTTP/1.1 200 OK
Date: Mon, 04 May 2020 17:27:00 GMT
Last-Modified: Thu, 30 Apr 2020 18:09:28 GMT
Content-Disposition: attachment; filename="haha.zip"; filename*=UTF-8''haha.zip
Content-Type: application/zip
Content-Length: 754816341
Note that you can also find the last modified date-time of the file using this method.
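If you only want the number, the header output above can be filtered. Here is a minimal sketch, assuming a POSIX shell with tr and awk available. The function name content_length is my own, not something curl provides, and the --silent and --location flags are additions to the command shown above (they hide the progress meter and follow redirects). Some servers do not send Content-Length at all, in which case nothing is printed.

```shell
# Hypothetical helper: print the value of the last Content-Length header
# read from stdin. Header names are matched case-insensitively, since
# some servers and HTTP/2 responses use lowercase names.
content_length() {
  # HTTP header lines end in CRLF, so strip the carriage returns first.
  tr -d '\r' |
    awk -F': *' '
      tolower($1) == "content-length" { size = $2 }   # keep the latest match
      END { if (size != "") print size }
    '
}

# Example usage (assumed invocation, performs a network request):
#   curl --silent --head --location http://foobar.com/haha.zip | content_length
```

The last match is kept because, with --location, curl prints one header block per redirect hop and only the final one describes the actual file.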
The --spider option of wget behaves similarly: it checks that the remote file exists and fetches only the header information, without downloading the file contents. The file size is listed in the Length field.
Here is an example:
$ wget --spider http://foobar.com/haha.zip
Spider mode enabled. Check if remote file exists.
--2020-05-04 10:26:39-- http://foobar.com/haha.zip
Resolving foobar.com (foobar.com)... 185.199.110.153
Connecting to foobar.com (foobar.com)|185.199.110.153|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 754816341 (720M) [application/zip]
Remote file exists.
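The same kind of filtering works for the wget output above. This is a rough sketch, assuming a POSIX shell with awk. The function name wget_length is hypothetical. Note that wget writes these log messages to stderr, so they have to be redirected to stdout before piping. Only a numeric Length value is printed, so if the server does not report a size the output is empty.

```shell
# Hypothetical helper: print the byte count from the "Length:" line of
# wget's log output read from stdin.
wget_length() {
  awk '
    # Accept the line only when the second field is a plain number.
    $1 == "Length:" && $2 ~ /^[0-9]+$/ { size = $2 }
    END { if (size != "") print size }
  '
}

# Example usage (assumed invocation, performs a network request):
#   wget --spider http://foobar.com/haha.zip 2>&1 | wget_length
```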