Data Processing | Palomar Transient Factory

Processing PTF data involves many steps. We present here a brief summary of the data processing as an aid to data users. Please see Surace et al. 2014 for a description of the overall system architecture and Laher et al. 2014 for a detailed discussion of the software and archive.

Data Acquisition

Data is taken at Palomar under control of a scheduling robot. The robot autonomously decides which fields to observe, based on the required cadence for different PTF experiments. The camera opens and closes its shutter, then reads out each of the 12 PTF CCDs independently and simultaneously. The data acquisition system bundles the pixel data as a multi-extension FITS file and attaches basic information to the image headers, such as the camera telemetry during the exposure. The headers also include information from the scheduling system and the telescope control system (TCS): the filter, the exposure time, and where the telescope thinks it was pointing.

As each image is taken, it is transmitted to the San Diego Supercomputer Center via a high-speed microwave data link. The data is then copied to two locations: the NERSC facility at Lawrence Berkeley Laboratory and the Infrared Processing and Analysis Center (IPAC) at Caltech.

PTF/iPTF Realtime Transient Detection

During the PTF survey, raw images were sent to the NERSC facility at Lawrence Berkeley Laboratory (LBL) for transient detection. NERSC performed basic data calibration, PSF-matching, and image differencing. Source detection on the difference images created a transient candidate list, which was then subjected to both human and machine vetting. The LBL transient pipeline output is not part of the public data releases and is not discussed further. A transient discovery pipeline was recently implemented (and is now operational) at IPAC to complement the NERSC pipeline. For details, see Masci et al. 2016. Public transient alerts are a planned component of the future ZTF survey.
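
The differencing and detection step described above can be illustrated with a toy sketch. This is not the NERSC or IPAC pipeline: it assumes the science and reference images are already aligned and PSF-matched, represents images as plain lists of rows, and flags single pixels rather than extracting sources. The function name `difference_candidates` and the `n_sigma` threshold are hypothetical.

```python
import statistics


def difference_candidates(science, reference, n_sigma=5.0):
    """Return (row, col, value) for difference-image pixels above a robust threshold.

    Assumes both images have the same shape and are already aligned and
    PSF-matched (that step is omitted from this sketch).
    """
    diff = [[s - r for s, r in zip(srow, rrow)]
            for srow, rrow in zip(science, reference)]
    flat = [v for row in diff for v in row]

    # Robust background and scatter: median and scaled median absolute deviation.
    med = statistics.median(flat)
    mad = statistics.median([abs(v - med) for v in flat])
    sigma = 1.4826 * mad  # MAD scaled to a Gaussian standard deviation

    if sigma == 0:
        return []  # no measurable scatter, so no meaningful threshold

    threshold = med + n_sigma * sigma
    return [(r, c, v)
            for r, row in enumerate(diff)
            for c, v in enumerate(row)
            if v > threshold]
```

A real pipeline would group neighboring pixels into sources and pass the resulting candidates on to the human and machine vetting described above.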

Data Ingest

Files are transferred continuously throughout the night to IPAC, passing first via high-speed microwave link to the San Diego Supercomputer Center, then to the Cahill astrophysics building at Caltech. Upon arrival at IPAC, the data is unpackaged. The raw files are archived to spinning disk, as well as to a deep tape storage archive. Metadata about each of the files (essentially, the image headers) is stored in a database. The multi-extension FITS files are then split into their individual CCD images.
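
As an illustration of storing header metadata in a database, here is a minimal sketch using Python's built-in `sqlite3` module. The table name `raw_images`, its columns, and the header keywords (`FILTER`, `EXPTIME`, `RA`, `DEC`) are assumptions made for the example; the actual IPAC schema and database engine are described in Laher et al. 2014 and are not reproduced here.

```python
import sqlite3


def create_header_table(conn):
    """Create a hypothetical table holding one row of header metadata per raw file."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS raw_images (
               filename  TEXT PRIMARY KEY,
               filter    TEXT,
               exptime_s REAL,
               ra_deg    REAL,
               dec_deg   REAL
           )"""
    )


def register_image(conn, filename, header):
    """Store selected header keywords (given as a dict) for one file."""
    conn.execute(
        "INSERT OR REPLACE INTO raw_images VALUES (?, ?, ?, ?, ?)",
        (filename, header.get("FILTER"), header.get("EXPTIME"),
         header.get("RA"), header.get("DEC")),
    )
    conn.commit()


def images_in_filter(conn, filter_name):
    """Return the filenames taken through a given filter, sorted by name."""
    rows = conn.execute(
        "SELECT filename FROM raw_images WHERE filter = ? ORDER BY filename",
        (filter_name,),
    )
    return [row[0] for row in rows]
```

Keeping the headers in a database lets later stages select frames (for example, all exposures in one filter on one night) without reopening the image files.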

High-Fidelity Processing

After all the data for a night has been received, the IPAC high-fidelity processing pipeline (also called the “frameproc” pipeline) starts. Each CCD is treated wholly independently, which provides a simple, natural unit of work for parallel processing. This pipeline has several steps, briefly summarized here:

Each incoming frame is checked for any of several error states. An initial blind astrometric solution is derived (via the astrometry.net package), which is used to check for pointing anomalies vs. commanded pointing.
All the frames for a given CCD are combined to create a “superbias” frame for that evening. This is subtracted from every frame, in addition to a floating bias level derived from the CCD overscan. The overscan regions are then trimmed.
The frames for a given CCD are adjusted to a common scale and combined to create a flatfield for that evening. This step includes an initial object detection stage for object masking during combination. Various levels of logic are used during this stage to ensure adequate dither diversity, so as to allow adequate object rejection, and otherwise to ensure the quality of the flatfield. This flatfield is ultimately divided into all the data.
Various metrics about the frames are registered in a data quality database. This includes measurements of the seeing, the background levels, etc.
An initial source extraction is then performed via “sextractor”.
This is then used as input to “scamp” to derive the pointing solution. A full distortion matrix is rederived for every frame, due in part to the need to account for atmospheric dispersion. The catalogs currently in use for deriving astrometry are the SDSS-DR10 and UCAC-4. No one set of parameters is able to solve the whole sky, so significant logic is used here to handle solution failures, with fallbacks to additional catalogs and algorithms. Astrometry checking has many stages. The primary check is to ensure that all sources in a given PTF frame and magnitude range have 2MASS counterparts. An additional check also examines the astrometric unit vectors throughout the image in order to reject images whose solutions impart twists that affect only subregions of the image. Typical errors for astrometry are 0.2-0.3 arcseconds.
In-house software is used to derive the photometric solution. All of the data that overlaps the SDSS is used to derive a fit for extinction, color, time, and position on the array. The last term is due to an issue in the early data caused by oil fogging on the dewar window, which is handled here as a constant (per night) delta on the zeropoint, akin to an illumination correction. Typical absolute calibration errors are 2-3%.
All of this information is written as metadata in the image headers. A mask file is also produced, which encodes information such as bad pixels in specific bitplanes in the image. A final source extractor catalog is also produced. This catalog contains individual sources seen in the images, with flux-calibrated photometry and astrometry. The original image header is also attached.
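
The bias and flatfield steps above amount to simple per-pixel arithmetic, sketched below in pure Python. This is an illustration under stated assumptions, not the frameproc code: images are lists of rows, the overscan pixels are passed in separately (that is, already trimmed from the image), frames are combined with a per-pixel median, and the object masking used when building the real flatfield is omitted. All function names are hypothetical.

```python
import statistics


def pixelwise_median(frames):
    """Combine equally shaped frames with a per-pixel median."""
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    return [[statistics.median([f[r][c] for f in frames]) for c in range(n_cols)]
            for r in range(n_rows)]


def remove_floating_bias(image, overscan):
    """Subtract the floating bias level, taken as the median of the overscan pixels."""
    level = statistics.median([v for row in overscan for v in row])
    return [[v - level for v in row] for row in image]


def build_flat(frames):
    """Scale frames to a common level, median-combine, and normalize to unit median."""
    scaled = []
    for frame in frames:
        level = statistics.median([v for row in frame for v in row])
        scaled.append([[v / level for v in row] for row in frame])
    combined = pixelwise_median(scaled)
    norm = statistics.median([v for row in combined for v in row])
    return [[v / norm for v in row] for row in combined]


def calibrate(image, overscan, superbias, flat):
    """Apply overscan subtraction, superbias subtraction, and flatfield division."""
    debiased = remove_floating_bias(image, overscan)
    return [[(v - b) / f for v, b, f in zip(vrow, brow, frow)]
            for vrow, brow, frow in zip(debiased, superbias, flat)]
```

In this sketch a superbias would be built by calling `pixelwise_median` on overscan-subtracted frames, and `build_flat` would be run on bias-subtracted frames from the same night and CCD.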

Intermediate Palomar Transient Factory
