Data Elements Reference

SiteData collects 85+ data elements to provide comprehensive analytics. This reference documents every data point, why it's collected, and how it's used across different analytics features.

Privacy First: SiteData is designed with privacy in mind. We don't use cookies for tracking, don't collect personal information, and truncate and hash IP addresses before storage. All data collection is GDPR-compliant.

Overview

Data elements are organized into 13 categories:

  • Core: Essential identifiers for tracking sessions and events
  • Page: Information about the pages visitors view
  • UTM: Marketing attribution and campaign tracking
  • Client: Browser, OS, and device information
  • Screen: Display and viewport dimensions
  • Hardware: Device hardware capabilities
  • Capabilities: Browser features and settings
  • Network: Connection type and speed
  • Context: Session and visit context
  • Performance: Page load and Web Vitals metrics
  • Geography: Location data from CloudFront
  • Identity: Anonymous identifiers
  • Security: Threat detection data

Core Identifiers

8 Elements

Essential identifiers that link events together and enable session tracking, visitor identification, and event categorization.

Element | Description | Used In
event_id | Unique identifier for each tracked event (UUID). Ensures event deduplication and enables detailed event auditing. | Events
site_id | Unique identifier for the website being tracked. Links all events to their parent website for multi-site analytics. | All Features
tracking_id | Public tracking ID used in the SDK (different from site_id for security). Validates that events come from authorized sources. | Validation
event_type | Type of event: pageview, click, scroll, form, custom, unload. Determines how events are processed and displayed. | Events, Heatmaps, Funnels
visitor_id | Anonymous visitor identifier stored in localStorage. Persists across sessions to track returning visitors without cookies. | Overview, User Flow, Predictive
session_id | Unique session identifier. A new session starts after 30 minutes of inactivity or at midnight. Groups related pageviews together. | User Flow, Funnels
timestamp | Server-side UTC timestamp when the event was received. Used for time-series analysis and data ordering. | All Features
client_timestamp | Client-side timestamp (Unix ms) when the event occurred. Accounts for network latency and enables accurate session timing. | Performance, Events
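
The session rule described for session_id (30 minutes of inactivity, or midnight) can be sketched as follows. This is a minimal illustration, not SiteData's actual implementation: the function name is hypothetical, and the reference does not say which timezone "midnight" refers to, so the sketch assumes UTC.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SESSION_TIMEOUT = timedelta(minutes=30)


def starts_new_session(previous: Optional[datetime], current: datetime) -> bool:
    """Return True if the event at `current` should open a new session.

    `previous` is the time of the visitor's last event, or None if there is none.
    Both datetimes must be timezone-aware.
    """
    if previous is None:
        return True  # no earlier event for this visitor
    if current - previous >= SESSION_TIMEOUT:
        return True  # 30 minutes (or more) of inactivity
    # Assumption: "midnight" means the UTC calendar date changed.
    prev_day = previous.astimezone(timezone.utc).date()
    cur_day = current.astimezone(timezone.utc).date()
    return prev_day != cur_day
```

For example, two events at 23:55 and 00:05 UTC are only 10 minutes apart but would still fall into different sessions under this reading of the rule.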

Page Information

8 Elements

Details about the pages visitors view, including URLs, titles, and referrer information. Powers top pages reports and navigation analysis.

Element | Description | Used In
page_url | Full URL of the page (protocol, domain, path, query). Used to identify unique pages and track page-level metrics. | Overview, User Flow, Heatmaps
page_path | URL path without domain or query string (e.g., /products/shoes). Normalizes URLs for aggregation across domains. | Top Pages, Funnels
page_hash | URL fragment/anchor (e.g., #section-2). Tracks in-page navigation for single-page apps and anchor links. | User Flow, Events
page_query | Query string parameters (e.g., ?id=123&sort=price). Used for UTM extraction and filtering dynamic pages. | UTM, Security
page_title | HTML document title (<title> tag). Provides human-readable page names in reports. | Top Pages, User Flow
page_referrer | Previous page URL (document.referrer). Tracks where visitors came from, powers referrer reports and entry page analysis. | Overview, User Flow
page_charset | Document character encoding (e.g., UTF-8). Helps diagnose encoding issues affecting analytics. | Diagnostics
page_compat_mode | Document rendering mode (CSS1Compat or BackCompat). Indicates standards vs. quirks mode. | Diagnostics
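
To show how page_path, page_query, and page_hash relate to page_url, here is a small sketch using Python's standard library. The function name is hypothetical, and keeping the leading "?" and "#" is an assumption based on the examples in the table.

```python
from urllib.parse import urlsplit


def split_page_url(page_url: str) -> dict:
    """Break a full URL into the path, query, and fragment elements."""
    parts = urlsplit(page_url)
    return {
        "page_url": page_url,
        "page_path": parts.path or "/",
        # Assumption: the stored values keep their "?" and "#" prefixes,
        # as in the table's examples.
        "page_query": f"?{parts.query}" if parts.query else "",
        "page_hash": f"#{parts.fragment}" if parts.fragment else "",
    }


result = split_page_url("https://example.com/products/shoes?id=123&sort=price#section-2")
```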

UTM & Attribution

15 Elements

Marketing attribution parameters from URL query strings. Track campaign performance across all major advertising platforms.

Element | Description | Used In
utm_source | Traffic source identifier (e.g., google, newsletter, facebook). Primary dimension for campaign analysis. | Overview, Campaigns
utm_medium | Marketing medium (e.g., cpc, email, social). Groups traffic by marketing channel type. | Overview, Campaigns
utm_campaign | Campaign name (e.g., spring_sale, product_launch). Tracks specific marketing initiatives. | Campaigns
utm_term | Paid search keywords. Tracks which search terms drove the visit. | Campaigns
utm_content | Ad content or A/B test variant. Differentiates similar ads or links within a campaign. | Campaigns
utm_id | Campaign ID for CRM integration. Links web analytics to external marketing systems. | Campaigns
gclid | Google Ads click identifier. Enables Google Ads conversion tracking and attribution. | Google Ads
fbclid | Facebook click identifier. Tracks visits from Facebook ads and posts. | Facebook Ads
msclkid | Microsoft Advertising click identifier. Tracks Bing Ads attribution. | Microsoft Ads
dclid | DoubleClick click identifier. Tracks Display & Video 360 campaigns. | Display Ads
twclid | Twitter/X click identifier. Tracks visits from Twitter ads. | Twitter Ads
ttclid | TikTok click identifier. Tracks TikTok advertising attribution. | TikTok Ads
li_fat_id | LinkedIn first-party ad tracking ID. Tracks LinkedIn advertising campaigns. | LinkedIn Ads
wbraid / gbraid | Google web-to-app and app-to-web tracking. Tracks cross-platform Google campaigns. | Google Ads
ref / affiliate / partner | Generic referral, affiliate, and partner tracking parameters. Supports custom attribution programs. | Affiliates
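
The attribution elements above all come from the query string. As an illustration only (the function name and the "first value wins" behaviour are assumptions, not documented SiteData behaviour), extracting them from page_query might look like this:

```python
from urllib.parse import parse_qs

# Parameter names taken from the table above.
ATTRIBUTION_PARAMS = (
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content", "utm_id",
    "gclid", "fbclid", "msclkid", "dclid", "twclid", "ttclid", "li_fat_id",
    "wbraid", "gbraid", "ref", "affiliate", "partner",
)


def extract_attribution(page_query: str) -> dict:
    """Return the attribution parameters present in a query string."""
    params = parse_qs(page_query.lstrip("?"))  # blank values are dropped by default
    # Assumption: if a parameter is repeated, the first value is used.
    return {name: params[name][0] for name in ATTRIBUTION_PARAMS if name in params}


attribution = extract_attribution("?utm_source=newsletter&utm_medium=email&gclid=abc123&id=42")
```

Unrelated parameters such as id are ignored, so only the three attribution values remain in the result.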

Client Environment

7 Elements

Browser, operating system, and device type information. Powers device segmentation and compatibility analysis.

Element | Description | Used In
client_device | Device category: desktop, mobile, or tablet. Determined from user agent and screen size. Essential for responsive design analysis. | Overview, Devices
client_browser | Browser name and version (e.g., Chrome 120, Safari 17). Tracks browser market share and compatibility. | Overview, Devices
client_os | Operating system (e.g., Windows 11, macOS, iOS 17, Android 14). Informs platform-specific optimizations. | Overview, Devices
client_language | Primary browser language (e.g., en-US, es-MX). Used for localization analysis and content recommendations. | Geography
client_languages | All accepted languages in preference order (comma-separated). Shows multilingual audience segments. | Geography
client_timezone | IANA timezone identifier (e.g., America/New_York). Enables time-of-day analysis in visitor's local time. | Geography, Predictive
client_timezone_offset | UTC offset in minutes (e.g., -300 for EST). Used for timezone grouping and scheduling optimization. | Predictive
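
As a worked example of time-of-day analysis, the sketch below shifts a UTC event timestamp into the visitor's local time using client_timezone_offset. It assumes the sign convention shown in the table (-300 means five hours behind UTC); the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def visitor_local_time(timestamp_utc: datetime, client_timezone_offset: int) -> datetime:
    """Convert a timezone-aware UTC timestamp to the visitor's local time.

    `client_timezone_offset` is minutes relative to UTC, e.g. -300 for EST.
    """
    local_zone = timezone(timedelta(minutes=client_timezone_offset))
    return timestamp_utc.astimezone(local_zone)


event = datetime(2024, 1, 15, 14, 30, tzinfo=timezone.utc)
local = visitor_local_time(event, -300)  # 09:30 local time
```

Note that JavaScript's Date.prototype.getTimezoneOffset() uses the opposite sign (it returns 300 for UTC-5), so a collector that reads that value would need to negate it to match the convention in the table.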

Screen & Display

10 Elements

Display characteristics including viewport, screen dimensions, and pixel density. Critical for responsive design and heatmap accuracy.

Element | Description | Used In
viewport_width | Browser viewport width in CSS pixels. The actual visible content area, excluding scrollbars. | Heatmaps, Devices
viewport_height | Browser viewport height in CSS pixels. Used to calculate scroll depth and fold position. | Heatmaps, Scroll
screen_width | Total screen width in pixels. Physical display resolution (may differ from viewport). | Devices
screen_height | Total screen height in pixels. Full display height including system UI. | Devices
screen_avail_width | Available screen width (excluding taskbar/dock). Usable space for browser windows. | Devices
screen_avail_height | Available screen height (excluding taskbar/dock). Helps understand actual browsing context. | Devices
screen_color_depth | Color depth in bits (typically 24 or 32). Indicates display color capabilities. | Devices
screen_pixel_depth | Pixel depth in bits. Usually matches color depth on modern displays. | Devices
screen_pixel_ratio | Device pixel ratio (DPR). Values >1 indicate HiDPI/Retina displays. Important for image optimization. | Devices, Performance
screen_orientation | Screen orientation: portrait-primary, landscape-primary, etc. Tracks mobile device usage patterns. | Devices, Heatmaps
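
To illustrate why screen_pixel_ratio matters for image optimization: a full-width image needs roughly viewport_width × DPR physical pixels to look sharp. The helper below is a hypothetical sketch of that arithmetic, not a SiteData feature.

```python
import math


def sharp_image_width(viewport_width: int, screen_pixel_ratio: float) -> int:
    """Physical pixel width a full-viewport image needs on this display."""
    return math.ceil(viewport_width * screen_pixel_ratio)


phone = sharp_image_width(390, 3.0)     # HiDPI phone
laptop = sharp_image_width(1280, 1.0)   # standard-density laptop
```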

Hardware

4 Elements

Device hardware capabilities that affect performance and interaction patterns.

Element | Description | Used In
hw_cpu_cores | Number of logical CPU cores (navigator.hardwareConcurrency). Indicates device processing power for performance optimization. | Performance, Devices
hw_device_memory | Approximate device RAM in GB (navigator.deviceMemory). Helps segment high vs. low-end devices. | Performance, Devices
hw_touch_enabled | Whether touch input is available. Distinguishes touch-capable devices from mouse-only. | Devices, Heatmaps
hw_max_touch_points | Maximum simultaneous touch points supported. Indicates multi-touch capability for gesture support. | Devices

Browser Capabilities

7 Elements

Browser feature detection for compatibility analysis and user experience optimization.

Element | Description | Used In
cap_cookies_enabled | Whether cookies are enabled. SiteData works without cookies, but this helps understand visitor privacy settings. | Devices
cap_do_not_track | Whether Do Not Track header is set. Respecting DNT is configurable in SiteData settings. | Privacy
cap_pdf_viewer | Whether browser has built-in PDF viewer. Affects document download vs. view behavior. | Devices
cap_webgl | Whether WebGL is supported. Indicates 3D graphics capability for rich content. | Devices
cap_local_storage | Whether localStorage is available. Required for visitor ID persistence. | Diagnostics
cap_session_storage | Whether sessionStorage is available. Used for session-scoped data. | Diagnostics
cap_ad_blocker | Whether an ad blocker is detected. Helps understand tracking script blocking rates. | Devices, Diagnostics

Network

4 Elements

Network connection information from the Network Information API. Enables performance optimization for different connection types.

Element | Description | Used In
net_connection_type | Effective connection type: 4g, 3g, 2g, or slow-2g. Indicates network speed category. | Performance, Devices
net_downlink | Estimated download bandwidth in Mbps. A browser-reported estimate, not a measured throughput. | Performance
net_rtt | Estimated round-trip time in milliseconds. Network latency indicator. | Performance
net_save_data | Whether data saver mode is enabled. Signals user preference for reduced data usage. | Performance

Session Context

5 Elements

Contextual information about the visitor's session and browsing history on your site.

Element | Description | Used In
ctx_page_view_num | Page view number within the current session (1, 2, 3...). Tracks session depth and engagement. | User Flow, Funnels
ctx_is_new_visitor | Whether this is the visitor's first ever visit. Distinguishes new vs. returning visitors. | Overview, Predictive
ctx_visit_count | Total number of visits by this visitor. Measures visitor loyalty and return rate. | User Flow, Predictive
ctx_entry_page | First page viewed in the current session. Tracks landing page performance. | User Flow, Funnels
ctx_is_entry | Whether this pageview is a session entry point. Used to count unique sessions per page. | User Flow
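
The per-pageview context elements can be derived from the ordered list of pages in a session. The following sketch shows one way to do that; the function name and input shape are hypothetical, and SiteData may compute these values differently (for example, on the client).

```python
def annotate_session(page_paths: list) -> list:
    """Attach session context fields to each pageview of one session, in order."""
    entry_page = page_paths[0] if page_paths else None
    return [
        {
            "page_path": path,
            "ctx_page_view_num": number,      # 1, 2, 3...
            "ctx_is_entry": number == 1,      # only the first pageview is an entry
            "ctx_entry_page": entry_page,     # same for every pageview in the session
        }
        for number, path in enumerate(page_paths, start=1)
    ]


views = annotate_session(["/", "/products/shoes", "/checkout"])
```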

Performance Metrics

16 Elements

Page load timing and Core Web Vitals from the Performance API. Essential for performance monitoring and optimization.

Page Load Timing

Element | Description | Used In
perf_dns | DNS lookup time in milliseconds. Time to resolve domain name to IP address. | Performance
perf_tcp | TCP connection time in milliseconds. Time to establish connection to server. | Performance
perf_ttfb | Time to First Byte in milliseconds. Server response time, key server-side metric. | Performance
perf_download | Content download time in milliseconds. Time to receive the HTML document. | Performance
perf_dom_ready | DOMContentLoaded time in milliseconds. When HTML is fully parsed and DOM is ready. | Performance
perf_load | Full page load time in milliseconds. When all resources (images, scripts) are loaded. | Performance
perf_dom_interactive | DOM interactive time in milliseconds. When the browser finishes parsing the HTML and the document's ready state becomes interactive. | Performance
perf_nav_type | Navigation type: navigate, reload, back_forward, prerender. Indicates how user arrived. | Performance, User Flow
perf_redirect_count | Number of redirects before reaching the page. Excessive redirects slow load times. | Performance
perf_transfer_size | Total bytes transferred for the page. Includes response headers plus the encoded (compressed) response body. | Performance
perf_encoded_body_size | Compressed payload size in bytes. Size of response body as received. | Performance
perf_decoded_body_size | Uncompressed payload size in bytes. Actual content size after decompression. | Performance
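
The timing elements above are durations, while the browser's Navigation Timing entry exposes timestamps. The sketch below shows one common way to derive the durations from those timestamps. The input keys are the standard PerformanceNavigationTiming attribute names, but the exact formulas SiteData uses are not stated in this reference, so treat the mapping (especially the TTFB definition) as an assumption.

```python
def page_load_timings(nav: dict) -> dict:
    """Derive timing elements (ms) from a Navigation Timing entry given as a dict."""
    start = nav.get("startTime", 0)
    return {
        "perf_dns": nav["domainLookupEnd"] - nav["domainLookupStart"],
        "perf_tcp": nav["connectEnd"] - nav["connectStart"],
        # Assumption: TTFB measured from the request being sent, matching
        # the "server response time" description in the table.
        "perf_ttfb": nav["responseStart"] - nav["requestStart"],
        "perf_download": nav["responseEnd"] - nav["responseStart"],
        "perf_dom_interactive": nav["domInteractive"] - start,
        "perf_dom_ready": nav["domContentLoadedEventEnd"] - start,
        "perf_load": nav["loadEventEnd"] - start,
    }


sample = {
    "startTime": 0,
    "domainLookupStart": 5, "domainLookupEnd": 25,
    "connectStart": 25, "connectEnd": 65,
    "requestStart": 70, "responseStart": 190, "responseEnd": 240,
    "domInteractive": 600, "domContentLoadedEventEnd": 650, "loadEventEnd": 1200,
}
timings = page_load_timings(sample)
```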

Core Web Vitals

Element | Description | Used In
perf_lcp | Largest Contentful Paint in milliseconds. Time until largest content element is visible. Good: <2500ms. | Performance, SEO
perf_fid | First Input Delay in milliseconds. Time from first interaction to browser response. Good: <100ms. | Performance, SEO
perf_cls | Cumulative Layout Shift score (0-1+). Measures visual stability. Good: <0.1. | Performance, SEO
perf_fcp | First Contentful Paint in milliseconds. Time until first content is rendered. Good: <1800ms. | Performance, SEO
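Core Web Vitals Impact: LCP, FID, and CLS are the Core Web Vitals that Google has used as ranking signals (Google replaced FID with Interaction to Next Paint, INP, in March 2024). SiteData tracks these metrics automatically so you can find and fix issues that may affect your search rankings.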
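
The "Good" thresholds listed in the table can be applied to collected values with a simple lookup. This is an illustrative sketch with hypothetical names; it covers only the "good" boundary because that is the only one this reference gives.

```python
# "Good" upper bounds from the table above (milliseconds, except the unitless CLS score).
GOOD_THRESHOLDS = {
    "perf_lcp": 2500,
    "perf_fid": 100,
    "perf_cls": 0.1,
    "perf_fcp": 1800,
}


def is_good(metric: str, value: float) -> bool:
    """Return True if `value` is below the documented "good" threshold."""
    return value < GOOD_THRESHOLDS[metric]


def good_share(metric: str, values: list) -> float:
    """Fraction of samples rated good, e.g. for a per-page summary."""
    if not values:
        return 0.0
    return sum(is_good(metric, v) for v in values) / len(values)
```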

Geography

10 Elements

Location data from CloudFront edge servers. No raw IP addresses are stored: geography is determined at the edge, and only the resulting location is recorded.

Element | Description | Used In
geo_country_code | ISO 3166-1 alpha-2 country code (e.g., US, GB, DE). Always available, 99%+ accuracy. | Geography, Overview
geo_country_name | Full country name (e.g., United States, Germany). Human-readable country identifier. | Geography
geo_region_code | State/province code (e.g., TX, CA, ON). ISO 3166-2 subdivision code. | Geography
geo_region_name | Full state/province name (e.g., Texas, Ontario). Human-readable region name. | Geography
geo_city | City name (e.g., Austin, Toronto). Available for ~80% of traffic. | Geography
geo_postal_code | Postal/ZIP code (e.g., 78701). Available for some regions, useful for local targeting. | Geography
geo_timezone | IANA timezone from CloudFront (e.g., America/Chicago). Server-side timezone determination. | Geography
geo_latitude | Approximate latitude coordinate. City-level precision, not exact location. | Geography
geo_longitude | Approximate longitude coordinate. City-level precision, not exact location. | Geography
geo_metro_code | Nielsen DMA code (US only). Designated Market Area for TV/media targeting. | Geography

Identity & Privacy

3 Elements

Anonymous identifiers that enable tracking without collecting personal information.

Element | Description | Used In
ip_hash | SHA-256 hash of the partial IP address (first 3 octets for IPv4). Original IP is never stored. Used only for session continuity. | Session
user_agent | Browser User-Agent string. Contains browser, OS, and device info. Used for device detection and bot filtering. | Devices, Security
data_json | Custom event data as JSON string. Stores any additional data you send with custom events (e.g., {"productId": "123"}). | Events, Funnels
Privacy by Design: SiteData does not collect email addresses, names, phone numbers, or any other personally identifiable information. IP addresses are truncated and then hashed before storage, so the full original address cannot be recovered from the stored value.
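
The ip_hash element is described as a SHA-256 hash of the first three octets of an IPv4 address. A minimal sketch of that idea follows. The exact input encoding, any salt, and the IPv6 handling are not specified in this reference, so those parts are assumptions and the function is illustrative rather than SiteData's implementation.

```python
import hashlib
import ipaddress


def ip_hash(ip: str) -> str:
    """Hash the first three octets of an IPv4 address with SHA-256."""
    address = ipaddress.ip_address(ip)  # raises ValueError on malformed input
    if address.version != 4:
        # IPv6 truncation is not described in this reference.
        raise ValueError("only IPv4 is covered by this sketch")
    # Assumption: the partial address is hashed as dotted text, unsalted.
    partial = ".".join(str(address).split(".")[:3])
    return hashlib.sha256(partial.encode("ascii")).hexdigest()


same_network = ip_hash("203.0.113.7") == ip_hash("203.0.113.250")
```

Because the last octet is dropped before hashing, every address in the same /24 range produces the same hash, which is what makes the value usable for session continuity without identifying a single address.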

Security Analysis

Derived from collected data

Security threat detection uses existing data elements to identify OWASP vulnerability probing and suspicious activity. No additional data is collected.

Analysis Type | Source Elements | Detection Purpose
SQL Injection | page_query, page_url | Detects SQL injection attempts in query strings (e.g., UNION SELECT, OR 1=1)
XSS (Cross-Site Scripting) | page_query, page_url | Identifies script injection attempts (e.g., <script>, javascript:)
Path Traversal | page_path, page_query | Detects directory traversal (e.g., ../../../etc/passwd)
Command Injection | page_query | Identifies shell command injection (e.g., ; cat /etc/passwd)
LFI/RFI | page_query | Local/Remote File Inclusion attempts (e.g., php://filter)
SSRF | page_query | Server-Side Request Forgery (e.g., url=http://169.254...)
Log4Shell | page_query, user_agent | Log4j exploitation attempts (e.g., ${jndi:ldap://)
Scanner Detection | user_agent, page_path | Identifies vulnerability scanners (Nmap, Nikto, SQLMap, etc.)
Admin Probing | page_path | Unauthorized admin area access attempts (/wp-admin, /phpmyadmin)
Threat Scoring: Each suspicious request receives a threat score (0-100) based on pattern severity. High scores (>75) trigger alerts. View results in the Security tab.
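
To make the pattern-and-score idea concrete, here is a small sketch that checks a request against a few of the example patterns from the table. The regular expressions, the severity weights, and the "highest match wins" rule are all hypothetical; only the 0-100 range and the >75 alert threshold come from this reference.

```python
import re
from urllib.parse import unquote_plus

# (name, pattern, severity weight). Patterns and weights are illustrative only.
RULES = [
    ("sql_injection", re.compile(r"union\s+select|\bor\s+1\s*=\s*1", re.I), 80),
    ("xss", re.compile(r"<script|javascript:", re.I), 70),
    ("path_traversal", re.compile(r"\.\./"), 60),
    ("log4shell", re.compile(r"\$\{jndi:", re.I), 95),
]
ALERT_THRESHOLD = 75  # scores above this trigger an alert


def threat_score(page_path: str, page_query: str, user_agent: str = "") -> tuple:
    """Return (score, matched rule names) for one request."""
    # Decode percent-encoding and "+" so encoded payloads are still matched.
    text = unquote_plus(" ".join((page_path, page_query, user_agent)))
    matched = [(name, weight) for name, pattern, weight in RULES if pattern.search(text)]
    score = max((weight for _, weight in matched), default=0)
    return score, [name for name, _ in matched]


def should_alert(score: int) -> bool:
    return score > ALERT_THRESHOLD
```

A request such as /search?q=1+UNION+SELECT+password would match the SQL injection rule and exceed the alert threshold under these weights, while an ordinary product page request scores 0.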

Data Retention

SiteData stores your analytics data based on your subscription plan:

  • Basic Plan: 30 days of detailed data, 12 months of aggregates
  • Pro Plan: 90 days of detailed data, 24 months of aggregates
  • Enterprise Plan: 365 days of detailed data, unlimited aggregates

Raw event data is stored in compressed Parquet format for efficient querying. You can export your data at any time from the dashboard.
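
As a quick illustration of the retention windows listed above, the sketch below checks whether an event's detailed data would still be within a plan's window. The plan keys and function name are hypothetical, and how SiteData treats the exact boundary day is not documented here.

```python
from datetime import datetime, timedelta, timezone

# Detailed-data retention in days, from the plan list above.
DETAILED_RETENTION_DAYS = {"basic": 30, "pro": 90, "enterprise": 365}


def detailed_data_available(plan: str, event_time: datetime, now: datetime) -> bool:
    """Return True if an event is still inside the plan's detailed-data window."""
    window = timedelta(days=DETAILED_RETENTION_DAYS[plan])
    # Assumption: an event exactly `window` old is still retained.
    return now - event_time <= window


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
event_time = now - timedelta(days=45)
```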