Full page screenshots with Python and Selenium

December 18, 2017 | Arthur Pemberton | 9 Comments

Currently, none of the major Selenium drivers (browsers) make it easy to take a screenshot of an entire web page. The following function scrolls through the page one viewport at a time, takes a screenshot at each position, then stitches the resulting images into a single PNG.

import math
import os
import tempfile

from PIL import Image


def save_fullpage_screenshot(driver, url, output_path, tmp_prefix='selenium_screenshot', tmp_suffix='.png'):
	"""
	Creates a full page screenshot using a selenium driver by scrolling and taking multiple screenshots,
	and stitching them into a single image.
	"""

	# load the page
	driver.get(url)

	# get the viewport height, the full page height, and the number of screenshots needed
	window_height = driver.execute_script('return window.innerHeight')
	scroll_height = driver.execute_script('return document.body.parentNode.scrollHeight')
	num = int(math.ceil(float(scroll_height) / float(window_height)))

	# create one temp file per screenshot
	tempfiles = []
	for i in range(num):
		fd, path = tempfile.mkstemp(prefix='{0}-{1:02}-'.format(tmp_prefix, i + 1), suffix=tmp_suffix)
		os.close(fd)
		tempfiles.append(path)

	try:
		# take screenshots, scrolling down one viewport between each
		for i, path in enumerate(tempfiles):
			if i > 0:
				driver.execute_script('window.scrollBy(%d,%d)' % (0, window_height))

			driver.save_screenshot(path)

		# stitch images together
		stitched = None
		for i, path in enumerate(tempfiles):
			img = Image.open(path)

			w, h = img.size
			y = i * window_height

			# the last of several screenshots overlaps the previous one when the page
			# height is not a multiple of the viewport height; keep only its bottom part
			if i == len(tempfiles) - 1 and num > 1:
				crop_height = scroll_height % h
				if crop_height > 0:
					img = img.crop((0, h - crop_height, w, h))
					w, h = img.size

			if stitched is None:
				stitched = Image.new('RGB', (w, scroll_height))

			# paste at the upper-left corner (x, y); Pillow clips anything past the canvas
			stitched.paste(img, (0, y))
		stitched.save(output_path)
	finally:
		# cleanup
		for path in tempfiles:
			if os.path.isfile(path):
				os.remove(path)

	return output_path
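
The scroll and crop arithmetic is easier to follow in isolation. The sketch below is not part of the function above; `plan_tiles` is a hypothetical helper that only computes, for each screenshot, where it lands in the stitched image and how many rows to drop from its top. It assumes one screenshot pixel per CSS pixel (which may not hold on high-DPI displays) and skips the crop when the page height is an exact multiple of the viewport height, as suggested in the comments below.

```python
import math


def plan_tiles(scroll_height, window_height):
    """Return one (paste_y, crop_top) pair per screenshot.

    paste_y is the row where the tile starts in the stitched image;
    crop_top is the number of rows to drop from the top of the tile.
    """
    num = int(math.ceil(float(scroll_height) / window_height))
    remainder = scroll_height % window_height
    tiles = []
    for i in range(num):
        paste_y = i * window_height
        crop_top = 0
        # Only the last of several tiles can overlap the previous one, because
        # the browser cannot scroll past the bottom of the page.
        if i == num - 1 and num > 1 and remainder > 0:
            crop_top = window_height - remainder
        tiles.append((paste_y, crop_top))
    return tiles


# A 2500px page in a 1000px viewport needs three screenshots; the last one
# shows rows 1500-2500, so its top 500 rows repeat the previous tile.
print(plan_tiles(2500, 1000))
```

For a page shorter than the viewport, the plan is a single tile with no crop, which avoids the black image case described in the comments.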

The following libraries are required:

  • selenium
  • Pillow
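
With both libraries installed, calling the function might look like the sketch below. It assumes that `save_fullpage_screenshot` from this post is defined in the same module and that Chrome with a matching chromedriver is available. The helper name `make_headless_chrome`, the window size, the URL, and the output file name are all placeholders for illustration.

```python
def make_headless_chrome(width=1280, height=800):
    """Start a headless Chrome session with a fixed window size (hypothetical helper)."""
    # Imported lazily so this module still loads where selenium is not installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument('--headless')
    options.add_argument('--window-size={0},{1}'.format(width, height))
    return webdriver.Chrome(options=options)


def main():
    driver = make_headless_chrome()
    try:
        # Placeholder URL and output path; save_fullpage_screenshot is the
        # function defined earlier in this post.
        save_fullpage_screenshot(driver, 'https://example.com/', 'example.png')
    finally:
        # Always close the browser, even if the screenshot fails.
        driver.quit()
```

Calling `main()` should write `example.png` to the current directory. Other drivers such as Firefox should work the same way, since the function only relies on `get`, `execute_script`, and `save_screenshot`.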

Slightly updated version here.

This article has 9 comments
  1. Marcel
    May 2, 2018

    Thank you for this piece of code. It helped me a lot.
    After working with it for a while I think you should not do the crop if scroll_height % h is equal to zero. I’ve modified it as follows:

    if i == ( len(tempfiles) - 1 ):
        crop_height = scroll_height % h
        if crop_height > 0:
            img = img.crop((0, h-crop_height, w, h))
            w, h = img.size
        pass

    Maybe you can think about it if you come to the same conclusion.

    • Arthur Pemberton
      May 2, 2018

      I haven’t tried it out myself yet, but it looks good. I’ll update the post once I do.

      • Marcel
        May 2, 2018

        Here is a little more information.

        If you take a screenshot of a very small page (where one screenshot is already enough), then scroll_height % h is always 0 and you get only a black image as the result.

        I think the problem can also occur on a big page. I assume that the last part of the stitched image will be a black box. But it's unlikely that you will run into this problem on a big page.

    • Arthur Pemberton
      December 26, 2018

      Sorry, finally spent some time on this. I replicated the issue you saw. To fix it, I changed `if i == ( tempfiles_len - 1 ):` to `if i == ( tempfiles_len - 1 ) and num > 1:`.

  2. gaurav saini
    June 26, 2018

    Hi,
    When I try the same code on a Mac, it crops the image from the right and bottom. The same code works fine on Windows.

    Your help would be really appreciated.

    Thanks

    • Arthur Pemberton
      July 4, 2018

      Sorry… I don’t really have any ideas. I worked with the code primarily on Linux.

  3. diego
    December 11, 2018

    Hello, please update it to Python 3.
    thanks!