Data Elements Reference
SiteData collects 85+ data elements to provide comprehensive analytics. This reference documents every data point, why it's collected, and how it's used across different analytics features.
Overview
Data elements are organized into 13 categories:
- Core: Essential identifiers for tracking sessions and events
- Page: Information about the pages visitors view
- UTM: Marketing attribution and campaign tracking
- Client: Browser, OS, and device information
- Screen: Display and viewport dimensions
- Hardware: Device hardware capabilities
- Capabilities: Browser features and settings
- Network: Connection type and speed
- Context: Session and visit context
- Performance: Page load and Web Vitals metrics
- Geography: Location data from CloudFront
- Identity: Anonymous identifiers
- Security: Threat detection data
Core Identifiers
8 Elements
Essential identifiers that link events together and enable session tracking, visitor identification, and event categorization.
| Element | Description | Used In |
|---|---|---|
| event_id | Unique identifier for each tracked event (UUID). Ensures event deduplication and enables detailed event auditing. | Events |
| site_id | Unique identifier for the website being tracked. Links all events to their parent website for multi-site analytics. | All Features |
| tracking_id | Public tracking ID used in the SDK (different from site_id for security). Validates that events come from authorized sources. | Validation |
| event_type | Type of event: pageview, click, scroll, form, custom, unload. Determines how events are processed and displayed. | Events, Heatmaps, Funnels |
| visitor_id | Anonymous visitor identifier stored in localStorage. Persists across sessions to track returning visitors without cookies. | Overview, User Flow, Predictive |
| session_id | Unique session identifier. A new session starts after 30 minutes of inactivity or at midnight. Groups related pageviews together. | User Flow, Funnels |
| timestamp | Server-side UTC timestamp when the event was received. Used for time-series analysis and data ordering. | All Features |
| client_timestamp | Client-side timestamp (Unix ms) when the event occurred. Accounts for network latency and enables accurate session timing. | Performance, Events |
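
The session rule described for session_id (a new session after 30 minutes of inactivity or at midnight) can be sketched as follows. This is a minimal illustration, not SiteData's implementation: the function name is hypothetical, and the sketch assumes that "midnight" means midnight UTC and that exactly 30 minutes of inactivity already counts as a timeout.

```python
from datetime import datetime, timedelta, timezone

# Inactivity window taken from the session_id description.
SESSION_TIMEOUT = timedelta(minutes=30)


def starts_new_session(previous: datetime, current: datetime) -> bool:
    """Return True if the event at `current` should open a new session.

    Both arguments must be timezone-aware datetimes.
    """
    # Rule 1: 30 minutes or more have passed since the previous event.
    if current - previous >= SESSION_TIMEOUT:
        return True
    # Rule 2: the calendar day changed between the two events.
    # Assumption: the day boundary is evaluated in UTC.
    prev_day = previous.astimezone(timezone.utc).date()
    curr_day = current.astimezone(timezone.utc).date()
    return prev_day != curr_day
```

If sessions should instead roll over at the visitor's local midnight, the same comparison could be made after shifting both timestamps by the visitor's UTC offset.
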
Page Information
8 Elements
Details about the pages visitors view, including URLs, titles, and referrer information. Powers top pages reports and navigation analysis.
| Element | Description | Used In |
|---|---|---|
| page_url | Full URL of the page (protocol, domain, path, query). Used to identify unique pages and track page-level metrics. | Overview, User Flow, Heatmaps |
| page_path | URL path without domain or query string (e.g., /products/shoes). Normalizes URLs for aggregation across domains. | Top Pages, Funnels |
| page_hash | URL fragment/anchor (e.g., #section-2). Tracks in-page navigation for single-page apps and anchor links. | User Flow, Events |
| page_query | Query string parameters (e.g., ?id=123&sort=price). Used for UTM extraction and filtering dynamic pages. | UTM, Security |
| page_title | HTML document title (<title> tag). Provides human-readable page names in reports. | Top Pages, User Flow |
| page_referrer | Previous page URL (document.referrer). Tracks where visitors came from, powers referrer reports and entry page analysis. | Overview, User Flow |
| page_charset | Document character encoding (e.g., UTF-8). Helps diagnose encoding issues affecting analytics. | Diagnostics |
| page_compat_mode | Document rendering mode (CSS1Compat or BackCompat). Indicates standards vs. quirks mode. | Diagnostics |
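
To make the relationship between page_url, page_path, page_query, and page_hash concrete, here is a small sketch that splits a URL with Python's standard library. The function name is hypothetical, and keeping the leading ? and # characters is an assumption based on the examples in the table.

```python
from urllib.parse import urlsplit


def split_page_url(page_url: str) -> dict:
    """Derive the path, query, and fragment elements from a full URL."""
    parts = urlsplit(page_url)
    return {
        "page_url": page_url,
        # Path without domain or query string; "/" when the URL has no path.
        "page_path": parts.path or "/",
        # Assumption: the stored value keeps the leading "?" as in the table.
        "page_query": f"?{parts.query}" if parts.query else "",
        # Assumption: the stored value keeps the leading "#" as in the table.
        "page_hash": f"#{parts.fragment}" if parts.fragment else "",
    }
```

For example, `split_page_url("https://example.com/products/shoes?id=123&sort=price#section-2")` yields a page_path of /products/shoes, a page_query of ?id=123&sort=price, and a page_hash of #section-2.
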
UTM & Attribution
15 Elements
Marketing attribution parameters from URL query strings. Track campaign performance across all major advertising platforms.
| Element | Description | Used In |
|---|---|---|
| utm_source | Traffic source identifier (e.g., google, newsletter, facebook). Primary dimension for campaign analysis. | Overview, Campaigns |
| utm_medium | Marketing medium (e.g., cpc, email, social). Groups traffic by marketing channel type. | Overview, Campaigns |
| utm_campaign | Campaign name (e.g., spring_sale, product_launch). Tracks specific marketing initiatives. | Campaigns |
| utm_term | Paid search keywords. Tracks which search terms drove the visit. | Campaigns |
| utm_content | Ad content or A/B test variant. Differentiates similar ads or links within a campaign. | Campaigns |
| utm_id | Campaign ID for CRM integration. Links web analytics to external marketing systems. | Campaigns |
| gclid | Google Ads click identifier. Enables Google Ads conversion tracking and attribution. | Google Ads |
| fbclid | Facebook click identifier. Tracks visits from Facebook ads and posts. | Facebook Ads |
| msclkid | Microsoft Advertising click identifier. Tracks Bing Ads attribution. | Microsoft Ads |
| dclid | DoubleClick click identifier. Tracks Display & Video 360 campaigns. | Display Ads |
| twclid | Twitter/X click identifier. Tracks visits from Twitter ads. | Twitter Ads |
| ttclid | TikTok click identifier. Tracks TikTok advertising attribution. | TikTok Ads |
| li_fat_id | LinkedIn first-party ad tracking ID. Tracks LinkedIn advertising campaigns. | LinkedIn Ads |
| wbraid / gbraid | Google web-to-app and app-to-web tracking. Tracks cross-platform Google campaigns. | Google Ads |
| ref / affiliate / partner | Generic referral, affiliate, and partner tracking parameters. Supports custom attribution programs. | Affiliates |
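
Since these elements all come from the URL query string, extracting them can be sketched as below. This is an illustrative sketch only: the function name is hypothetical, and taking the first value when a parameter repeats is an assumption, not documented SiteData behavior.

```python
from urllib.parse import parse_qs

# Parameter names taken from the table above.
ATTRIBUTION_PARAMS = (
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "utm_id", "gclid", "fbclid", "msclkid", "dclid", "twclid", "ttclid",
    "li_fat_id", "wbraid", "gbraid", "ref", "affiliate", "partner",
)


def extract_attribution(page_query: str) -> dict:
    """Pick the attribution parameters out of a page_query string."""
    # parse_qs maps each name to a list of values and drops empty values.
    params = parse_qs(page_query.lstrip("?"))
    # Assumption: when a parameter repeats, the first occurrence wins.
    return {name: params[name][0] for name in ATTRIBUTION_PARAMS if name in params}
```

Parameters that are not in the list (such as id or sort) are ignored, so the result contains only attribution data.
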
Client Environment
7 Elements
Browser, operating system, and device type information. Powers device segmentation and compatibility analysis.
| Element | Description | Used In |
|---|---|---|
| client_device | Device category: desktop, mobile, or tablet. Determined from user agent and screen size. Essential for responsive design analysis. | Overview, Devices |
| client_browser | Browser name and version (e.g., Chrome 120, Safari 17). Tracks browser market share and compatibility. | Overview, Devices |
| client_os | Operating system (e.g., Windows 11, macOS, iOS 17, Android 14). Informs platform-specific optimizations. | Overview, Devices |
| client_language | Primary browser language (e.g., en-US, es-MX). Used for localization analysis and content recommendations. | Geography |
| client_languages | All accepted languages in preference order (comma-separated). Shows multilingual audience segments. | Geography |
| client_timezone | IANA timezone identifier (e.g., America/New_York). Enables time-of-day analysis in visitor's local time. | Geography, Predictive |
| client_timezone_offset | UTC offset in minutes (e.g., -300 for EST). Used for timezone grouping and scheduling optimization. | Predictive |
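
As an illustration of time-of-day analysis, the sketch below converts a UTC event time to the visitor's local hour using client_timezone_offset. It assumes the sign convention shown in the table (-300 for EST, meaning local time is UTC minus five hours); note that JavaScript's Date.prototype.getTimezoneOffset() uses the opposite sign. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def local_hour(timestamp_utc: datetime, client_timezone_offset: int) -> int:
    """Return the visitor's local hour (0-23) for a UTC event time.

    Assumption: the offset is added to UTC, so -300 means UTC-05:00.
    """
    local_time = timestamp_utc + timedelta(minutes=client_timezone_offset)
    return local_time.hour
```

With this convention, an event received at 14:30 UTC from a visitor with an offset of -300 falls in local hour 9.
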
Screen & Display
10 Elements
Display characteristics including viewport, screen dimensions, and pixel density. Critical for responsive design and heatmap accuracy.
| Element | Description | Used In |
|---|---|---|
| viewport_width | Browser viewport width in CSS pixels. The actual visible content area, excluding scrollbars. | Heatmaps, Devices |
| viewport_height | Browser viewport height in CSS pixels. Used to calculate scroll depth and fold position. | Heatmaps, Scroll |
| screen_width | Total screen width in pixels. Physical display resolution (may differ from viewport). | Devices |
| screen_height | Total screen height in pixels. Full display height including system UI. | Devices |
| screen_avail_width | Available screen width (excluding taskbar/dock). Usable space for browser windows. | Devices |
| screen_avail_height | Available screen height (excluding taskbar/dock). Helps understand actual browsing context. | Devices |
| screen_color_depth | Color depth in bits (typically 24 or 32). Indicates display color capabilities. | Devices |
| screen_pixel_depth | Pixel depth in bits. Usually matches color depth on modern displays. | Devices |
| screen_pixel_ratio | Device pixel ratio (DPR). Values >1 indicate HiDPI/Retina displays. Important for image optimization. | Devices, Performance |
| screen_orientation | Screen orientation: portrait-primary, landscape-primary, etc. Tracks mobile device usage patterns. | Devices, Heatmaps |
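
The table notes that viewport_height feeds scroll-depth calculations. One common way to compute scroll depth is sketched below. The scroll_y and document_height inputs are hypothetical: they are not listed among the data elements above, and the formula is a general convention rather than SiteData's documented method.

```python
def scroll_depth_percent(scroll_y: float, viewport_height: float,
                         document_height: float) -> float:
    """Percentage of the page that has been scrolled into view.

    scroll_y and document_height are hypothetical inputs (CSS pixels).
    """
    if document_height <= 0:
        return 0.0
    # The lowest visible pixel is the scroll offset plus the viewport height.
    visible_bottom = scroll_y + viewport_height
    # Cap at 100 for pages shorter than the viewport.
    return min(100.0, 100.0 * visible_bottom / document_height)
```

At the top of the page (scroll_y of 0), the result is the share of the page above the fold.
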
Hardware
4 Elements
Device hardware capabilities that affect performance and interaction patterns.
| Element | Description | Used In |
|---|---|---|
| hw_cpu_cores | Number of logical CPU cores (navigator.hardwareConcurrency). Indicates device processing power for performance optimization. | Performance, Devices |
| hw_device_memory | Approximate device RAM in GB (navigator.deviceMemory). Helps segment high vs. low-end devices. | Performance, Devices |
| hw_touch_enabled | Whether touch input is available. Distinguishes touch-capable devices from mouse-only. | Devices, Heatmaps |
| hw_max_touch_points | Maximum simultaneous touch points supported. Indicates multi-touch capability for gesture support. | Devices |
Browser Capabilities
7 Elements
Browser feature detection for compatibility analysis and user experience optimization.
| Element | Description | Used In |
|---|---|---|
| cap_cookies_enabled | Whether cookies are enabled. SiteData works without cookies, but this helps understand visitor privacy settings. | Devices |
| cap_do_not_track | Whether Do Not Track header is set. Respecting DNT is configurable in SiteData settings. | Privacy |
| cap_pdf_viewer | Whether browser has built-in PDF viewer. Affects document download vs. view behavior. | Devices |
| cap_webgl | Whether WebGL is supported. Indicates 3D graphics capability for rich content. | Devices |
| cap_local_storage | Whether localStorage is available. Required for visitor ID persistence. | Diagnostics |
| cap_session_storage | Whether sessionStorage is available. Used for session-scoped data. | Diagnostics |
| cap_ad_blocker | Whether an ad blocker is detected. Helps understand tracking script blocking rates. | Devices, Diagnostics |
Network
4 Elements
Network connection information from the Network Information API. Enables performance optimization for different connection types.
| Element | Description | Used In |
|---|---|---|
| net_connection_type | Effective connection type: 4g, 3g, 2g, or slow-2g. Indicates network speed category. | Performance, Devices |
| net_downlink | Estimated download bandwidth in Mbps. A browser-reported estimate of network throughput, not a direct measurement. | Performance |
| net_rtt | Estimated round-trip time in milliseconds. Network latency indicator. | Performance |
| net_save_data | Whether data saver mode is enabled. Signals user preference for reduced data usage. | Performance |
Session Context
5 Elements
Contextual information about the visitor's session and browsing history on your site.
| Element | Description | Used In |
|---|---|---|
| ctx_page_view_num | Page view number within the current session (1, 2, 3...). Tracks session depth and engagement. | User Flow, Funnels |
| ctx_is_new_visitor | Whether this is the visitor's first ever visit. Distinguishes new vs. returning visitors. | Overview, Predictive |
| ctx_visit_count | Total number of visits by this visitor. Measures visitor loyalty and return rate. | User Flow, Predictive |
| ctx_entry_page | First page viewed in the current session. Tracks landing page performance. | User Flow, Funnels |
| ctx_is_entry | Whether this pageview is a session entry point. Used to count unique sessions per page. | User Flow |
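
The per-session context elements relate to one another in a simple way, sketched below for an ordered list of page paths from one session. The function name and the input shape are hypothetical. The sketch only illustrates the definitions in the table.

```python
def annotate_session(page_paths: list[str]) -> list[dict]:
    """Attach session-context elements to each pageview of one session.

    `page_paths` is assumed to be ordered by time within a single session.
    """
    # The entry page is the first page viewed in the session.
    entry_page = page_paths[0] if page_paths else None
    return [
        {
            "page_path": path,
            "ctx_page_view_num": number,   # 1, 2, 3, ...
            "ctx_is_entry": number == 1,   # only the first pageview is an entry
            "ctx_entry_page": entry_page,
        }
        for number, path in enumerate(page_paths, start=1)
    ]
```

Counting the rows where ctx_is_entry is true for a given page gives the number of sessions that started on that page.
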
Performance Metrics
17 Elements
Page load timing and Core Web Vitals from the Performance API. Essential for performance monitoring and optimization.
Page Load Timing
| Element | Description | Used In |
|---|---|---|
| perf_dns | DNS lookup time in milliseconds. Time to resolve domain name to IP address. | Performance |
| perf_tcp | TCP connection time in milliseconds. Time to establish connection to server. | Performance |
| perf_ttfb | Time to First Byte in milliseconds. Server response time, key server-side metric. | Performance |
| perf_download | Content download time in milliseconds. Time to receive the HTML document. | Performance |
| perf_dom_ready | DOMContentLoaded time in milliseconds. When HTML is fully parsed and DOM is ready. | Performance |
| perf_load | Full page load time in milliseconds. When all resources (images, scripts) are loaded. | Performance |
| perf_dom_interactive | DOM interactive time in milliseconds. When page becomes interactive. | Performance |
| perf_nav_type | Navigation type: navigate, reload, back_forward, prerender. Indicates how user arrived. | Performance, User Flow |
| perf_redirect_count | Number of redirects before reaching the page. Excessive redirects slow load times. | Performance |
| perf_transfer_size | Total bytes transferred for the page. Includes headers and compression overhead. | Performance |
| perf_encoded_body_size | Compressed payload size in bytes. Size of response body as received. | Performance |
| perf_decoded_body_size | Uncompressed payload size in bytes. Actual content size after decompression. | Performance |
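
The timing elements above correspond to intervals between standard Navigation Timing marks. The sketch below shows one plausible derivation from a dictionary of PerformanceNavigationTiming attributes (the attribute names are the browser's own). Which exact marks SiteData subtracts is an assumption, particularly for perf_ttfb, which some tools measure from startTime rather than requestStart.

```python
def page_load_timings(nav: dict) -> dict:
    """Derive page-load durations (ms) from Navigation Timing marks.

    `nav` holds PerformanceNavigationTiming attributes in milliseconds.
    The choice of start and end marks is an assumption, not SiteData's spec.
    """
    return {
        "perf_dns": nav["domainLookupEnd"] - nav["domainLookupStart"],
        "perf_tcp": nav["connectEnd"] - nav["connectStart"],
        # Assumption: TTFB is measured from the start of the request.
        "perf_ttfb": nav["responseStart"] - nav["requestStart"],
        "perf_download": nav["responseEnd"] - nav["responseStart"],
        "perf_dom_interactive": nav["domInteractive"] - nav["startTime"],
        "perf_dom_ready": nav["domContentLoadedEventEnd"] - nav["startTime"],
        "perf_load": nav["loadEventEnd"] - nav["startTime"],
    }
```
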
Core Web Vitals
| Element | Description | Used In |
|---|---|---|
| perf_lcp | Largest Contentful Paint in milliseconds. Time until largest content element is visible. Good: <2500ms. | Performance, SEO |
| perf_fid | First Input Delay in milliseconds. Time from first interaction to browser response. Good: <100ms. | Performance, SEO |
| perf_cls | Cumulative Layout Shift score (0-1+). Measures visual stability. Good: <0.1. | Performance, SEO |
| perf_fcp | First Contentful Paint in milliseconds. Time until first content is rendered. Good: <1800ms. | Performance, SEO |
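
The "Good" thresholds in the table can be applied to recorded values with a simple lookup, as in this sketch. The function and constant names are hypothetical. The threshold values are the ones listed above.

```python
# "Good" upper bounds from the Core Web Vitals table (ms, except CLS).
GOOD_THRESHOLDS = {
    "perf_lcp": 2500,
    "perf_fid": 100,
    "perf_cls": 0.1,
    "perf_fcp": 1800,
}


def is_good(metric: str, value: float) -> bool:
    """Return True if `value` is below the metric's "Good" threshold."""
    return value < GOOD_THRESHOLDS[metric]
```

For example, an LCP of 1900 ms counts as good, while a CLS of 0.25 does not.
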
Geography
10 Elements
Location data from CloudFront edge servers. No IP addresses are stored—geography is determined at the edge and only the location is recorded.
| Element | Description | Used In |
|---|---|---|
| geo_country_code | ISO 3166-1 alpha-2 country code (e.g., US, GB, DE). Always available, 99%+ accuracy. | Geography, Overview |
| geo_country_name | Full country name (e.g., United States, Germany). Human-readable country identifier. | Geography |
| geo_region_code | State/province code (e.g., TX, CA, ON). ISO 3166-2 subdivision code. | Geography |
| geo_region_name | Full state/province name (e.g., Texas, Ontario). Human-readable region name. | Geography |
| geo_city | City name (e.g., Austin, Toronto). Available for ~80% of traffic. | Geography |
| geo_postal_code | Postal/ZIP code (e.g., 78701). Available for some regions, useful for local targeting. | Geography |
| geo_timezone | IANA timezone from CloudFront (e.g., America/Chicago). Server-side timezone determination. | Geography |
| geo_latitude | Approximate latitude coordinate. City-level precision, not exact location. | Geography |
| geo_longitude | Approximate longitude coordinate. City-level precision, not exact location. | Geography |
| geo_metro_code | Nielsen DMA code (US only). Designated Market Area for TV/media targeting. | Geography |
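
CloudFront can add viewer-location request headers at the edge, which is one way location data like this can be obtained without storing an IP address. The sketch below maps those headers to the elements in the table. The header names are CloudFront's, but the mapping itself and the function name are assumptions about how SiteData might populate these fields.

```python
# CloudFront viewer-location headers (lowercased) mapped to data elements.
# Assumption: this mapping is illustrative, not SiteData's documented logic.
HEADER_TO_ELEMENT = {
    "cloudfront-viewer-country": "geo_country_code",
    "cloudfront-viewer-country-name": "geo_country_name",
    "cloudfront-viewer-country-region": "geo_region_code",
    "cloudfront-viewer-country-region-name": "geo_region_name",
    "cloudfront-viewer-city": "geo_city",
    "cloudfront-viewer-postal-code": "geo_postal_code",
    "cloudfront-viewer-time-zone": "geo_timezone",
    "cloudfront-viewer-latitude": "geo_latitude",
    "cloudfront-viewer-longitude": "geo_longitude",
    "cloudfront-viewer-metro-code": "geo_metro_code",
}


def geo_from_headers(headers: dict) -> dict:
    """Build the geo_* elements from request headers (case-insensitive)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return {
        element: lowered[header]
        for header, element in HEADER_TO_ELEMENT.items()
        if header in lowered
    }
```

Headers that CloudFront does not supply for a request (for example, the city for some traffic) are simply absent from the result.
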
Identity & Privacy
3 Elements
Anonymous identifiers that enable tracking without collecting personal information.
| Element | Description | Used In |
|---|---|---|
| ip_hash | SHA-256 hash of the partial IP address (first 3 octets for IPv4). Original IP is never stored. Used only for session continuity. | Session |
| user_agent | Browser User-Agent string. Contains browser, OS, and device info. Used for device detection and bot filtering. | Devices, Security |
| data_json | Custom event data as JSON string. Stores any additional data you send with custom events (e.g., {"productId": "123"}). | Events, Funnels |
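
The ip_hash description (SHA-256 over the first three IPv4 octets) can be illustrated as follows. This is a sketch under stated assumptions: the exact string that is hashed, the absence of a salt, and the function name are all hypothetical, and IPv6 handling is not covered because the source does not describe it.

```python
import hashlib


def ip_hash(ipv4: str) -> str:
    """Hash the first three octets of an IPv4 address with SHA-256."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    # Drop the last octet so the full address never reaches the hash input.
    # Assumption: the hashed string is the dotted prefix, with no salt.
    partial = ".".join(octets[:3])
    return hashlib.sha256(partial.encode("utf-8")).hexdigest()
```

Because the last octet is discarded, all addresses in the same /24 block produce the same hash.
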
Security Analysis
Derived from collected data
Security threat detection uses existing data elements to identify OWASP vulnerability probing and suspicious activity. No additional data is collected.
| Analysis Type | Source Elements | Detection Purpose |
|---|---|---|
| SQL Injection | page_query, page_url | Detects SQL injection attempts in query strings (e.g., UNION SELECT, OR 1=1) |
| XSS (Cross-Site Scripting) | page_query, page_url | Identifies script injection attempts (e.g., <script>, javascript:) |
| Path Traversal | page_path, page_query | Detects directory traversal (e.g., ../../../etc/passwd) |
| Command Injection | page_query | Identifies shell command injection (e.g., ; cat /etc/passwd) |
| LFI/RFI | page_query | Local/Remote File Inclusion attempts (e.g., php://filter) |
| SSRF | page_query | Server-Side Request Forgery (e.g., url=http://169.254...) |
| Log4Shell | page_query, user_agent | Log4j exploitation attempts (e.g., ${jndi:ldap://) |
| Scanner Detection | user_agent, page_path | Identifies vulnerability scanners (Nmap, Nikto, SQLMap, etc.) |
| Admin Probing | page_path | Unauthorized admin area access attempts (/wp-admin, /phpmyadmin) |
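
Pattern-based detection of this kind can be sketched with a few regular expressions built from the examples in the table. The patterns, names, and the choice to URL-decode first are illustrative assumptions covering only four of the analysis types. They are not SiteData's actual rules, and real detection would need far broader coverage.

```python
import re
from urllib.parse import unquote_plus

# Simplified patterns based on the examples in the table above.
PATTERNS = {
    "sql_injection": re.compile(r"union\s+select|\bor\s+1\s*=\s*1", re.IGNORECASE),
    "xss": re.compile(r"<script|javascript:", re.IGNORECASE),
    "path_traversal": re.compile(r"\.\./"),
    "log4shell": re.compile(r"\$\{jndi:", re.IGNORECASE),
}


def detect_threats(page_query: str) -> list[str]:
    """Return the names of the patterns that match a page_query value."""
    # Decode percent-encoding and "+" so encoded payloads are also matched.
    decoded = unquote_plus(page_query)
    return [name for name, pattern in PATTERNS.items() if pattern.search(decoded)]
```

An ordinary query such as ?id=123&sort=price matches none of the patterns and returns an empty list.
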
Data Retention
SiteData stores your analytics data based on your subscription plan:
- Basic Plan: 30 days of detailed data, 12 months of aggregates
- Pro Plan: 90 days of detailed data, 24 months of aggregates
- Enterprise Plan: 365 days of detailed data, unlimited aggregates
Raw event data is stored in compressed Parquet format for efficient querying. You can export your data at any time from the dashboard.
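
To see what the retention windows mean in practice, the sketch below computes the oldest date for which detailed data would still be available under each plan. The plan keys and function name are hypothetical, and treating the window as a rolling count of calendar days is an assumption.

```python
from datetime import date, timedelta

# Detailed-data retention in days, from the plan list above.
DETAILED_RETENTION_DAYS = {"basic": 30, "pro": 90, "enterprise": 365}


def detailed_data_cutoff(plan: str, today: date) -> date:
    """Oldest date still inside the plan's detailed-data window.

    Assumption: retention is a rolling window counted in calendar days.
    """
    return today - timedelta(days=DETAILED_RETENTION_DAYS[plan])
```

For example, on 2024-03-31 the Basic plan's window would reach back to 2024-03-01.
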