# High-level overview of Save-Page-As code

This document describes code under `//content/browser/downloads`
restricting the scope only to code handling Save-Page-As functionality
(i.e. leaving out other downloads-related code).
This document focuses on high-level overview and aspects of the code that
span multiple compilation units (hoping that individual compilation units
are described by their code comments or by their code structure).

## Classes overview

* SavePackage class
    * coordinates overall save-page-as request
    * created and owned by `WebContents`
      (ref-counted today, but it is unnecessary - see https://crbug.com/596953)
    * UI-thread object

* SaveFileCreateInfo::SaveFileSource enum
    * classifies `SaveItem` and `SaveFile` processing into 2 flavours:
        * `SAVE_FILE_FROM_NET` (see `SaveFileResourceHandler`)
        * `SAVE_FILE_FROM_DOM` (see "Complete HTML" section below)

* SaveItem class
    * tracks saving a single file
    * created and owned by `SavePackage`
    * UI-thread object

* SaveFileManager class
    * coordinates between FILE and UI threads
        * Gets requests from `SavePackage` and communicates results back to
          `SavePackage` on the UI thread.
        * Shephards data (received from the network OR from DOM) into
          FILE thread - via `SaveFileManager::UpdateSaveProgress`
    * created and owned by `BrowserMainLoop`
      (ref-counted today, but it is unnecessary - see https://crbug.com/596953)
    * The global instance can be retrieved by the Get method.

* SaveFile class
    * tracks saving a single file
    * created and owned by `SaveFileManager`
    * FILE-thread object

* SaveFileResourceHandler class
    * tracks network downloads + forwards their status into `SaveFileManager`
      (onto FILE-thread)
    * created by `ResourceDispatcherHostImpl::BeginSaveFile`
    * IO-thread object

* SaveFileCreateInfo POD struct
    * short-lived object holding data passed to callbacks handling start of
      saving a file.

* MHTMLGenerationManager class
    * singleton that manages progress of jobs responsible for saving individual
      MHTML files (represented by `MHTMLGenerationManager::Job`).


## Overview of the processing flow

Save-Page-As flow starts with `WebContents::OnSavePage`.
The flow is different depending on the save format chosen by the user
(each flow is described in a separate section below).

### Complete HTML

Very high-level flow of saving a page as "Complete HTML":

* Step 1: `SavePackage` asks all frames for "savable resources"
          and creates `SaveItem` for each of files that need to be saved

* Step 2: `SavePackage` first processes `SAVE_FILE_FROM_NET`
          `SaveItem`s and asks `SaveFileManager` to save
          them.

* Step 3: `SavePackage` handles remaining `SAVE_FILE_FROM_DOM` `SaveItem`s and
          asks each frame to serialize its DOM/HTML (each frame gets from
          `SavePackage` a map covering local paths that need to be referenced by
          the frame).  Responses from frames get forwarded to `SaveFileManager`
          to be written to disk.


### MHTML

Very high-level flow of saving a page as MHTML:

* Step 1: `WebContents::GenerateMHTML` is called by either `SavePackage` (for
          Save-Page-As UI) or Extensions (via `chrome.pageCapture` extensions
          API) or by an embedder of `WebContents` (since this is public API of
          //content).

* Step 2: `MHTMLGenerationManager` creates a new instance of
          `MHTMLGenerationManager::Job` that coordinates generation of
          the MHTML file by sequentially (one-at-a-time) asking each
          frame to write its portion of MHTML to a file handle.  Other
          classes (i.e. `SavePackage` and/or `SaveFileManager`) are not
          used at this step at all.

* Step 3: When done `MHTMLGenerationManager` destroys
          `MHTMLGenerationManager::Job` instance and calls a completion
          callback which in case of Save-Page-As will end up in
          `SavePackage::OnMHTMLGenerated`.

Note: MHTML format is by default disabled in Save-Page-As UI on Windows, MacOS
and Linux (it is the default on ChromeOS), but for testing this can be easily
changed using `--save-page-as-mhtml` command line switch.


### HTML Only

Very high-level flow of saving a page as "HTML Only":

* `SavePackage` creates only a single `SaveItem` (always `SAVE_FILE_FROM_NET`)
  and asks `SaveFileManager` to process it
  (as in the Complete HTML individual SaveItem handling above.).


## Other relevant code

Pointers to related code outside of `//content/browser/download`:

* End-to-end tests:
    * `//chrome/browser/downloads/save_page_browsertest.cc`
    * `//chrome/test/data/save_page/...`

* Other tests:
    * `//content/browser/downloads/*test*.cc`
    * `//content/renderer/dom_serializer_browsertest.cc` - single process... :-/

* Elsewhere in `//content`:
    * `//content/renderer/savable_resources...`

* Blink:
    * `//third_party/WebKit/public/web/WebFrameSerializer...`
    * `//third_party/WebKit/Source/web/WebFrameSerializerImpl...`
      (used for Complete HTML today;  should use `FrameSerializer` instead in
      the long-term - see https://crbug.com/328354).
    * `//third_party/WebKit/Source/core/frame/FrameSerializer...`
      (used for MHTML today)
    * `//third_party/WebKit/Source/platform/mhtml/MHTMLArchive...`

