API Performance Metrics Documentation

Overview

Aptori’s testing process involves sending numerous API requests to assess how the target system handles different types of queries. Many of these requests use fuzzed values to test API robustness, which means that not all calls will be successful. By incorporating randomized inputs, Aptori evaluates the API's overall posture, including how it responds to both valid and invalid requests. The performance metrics captured include both successful API calls and those that return error responses (e.g., 400 status codes). This comprehensive analysis helps organizations identify vulnerabilities, bottlenecks, and areas for optimization in their API infrastructure.

This section describes the key performance metrics tracked for HTTP requests made to each API operation: request counts, failed request counts, time to first byte, and response time.

Tracked Metrics

The following key performance metrics are collected and analyzed:

1. Request Count

The total number of HTTP requests sent to the API operation.

2. Failed Request Count

The number of HTTP requests sent to the API operation that resulted in a non-200 response code.
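
As a minimal sketch of how the two counts relate, the snippet below tallies a list of observed status codes. The `summarize_counts` helper and the sample codes are hypothetical and not part of Aptori; the failure rule simply follows the definition above (any status code other than 200).

```python
from typing import Iterable


def summarize_counts(status_codes: Iterable[int]) -> dict:
    """Tally total and failed requests for one API operation.

    Assumption: a request counts as failed when its status code is not 200,
    matching the definition given in this section.
    """
    codes = list(status_codes)
    failed = sum(1 for code in codes if code != 200)
    return {"request_count": len(codes), "failed_request_count": failed}


# Hypothetical status codes observed during a fuzzed test run.
observed = [200, 200, 400, 200, 404, 500, 200]
counts = summarize_counts(observed)
print(counts)
```

Because fuzzed inputs are expected to trigger error responses, a high failed request count in this kind of test run is not necessarily a sign of a problem.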

3. Time to First Byte (TTFB)

The time between sending the request and receiving the first byte of the response.
It reflects both server processing efficiency and network latency.
Factors Influencing TTFB:
- Server Processing Time: The time taken by the backend system to process the request and generate a response.
- Network Latency: The delay in network transmission between the client and server, influenced by geographic distance and network congestion.
- Caching Mechanisms: Use of caching at various levels (browser, CDN, API Gateway) can significantly reduce TTFB.
- Server Load and Performance: High server loads can increase TTFB due to resource contention and queuing delays.
Metrics Recorded:
- Minimum (Min): The shortest recorded time to first byte.
- Maximum (Max): The longest recorded time to first byte.
- 90th Percentile (P90): The time to first byte at or below which 90% of requests fall.
- 95th Percentile (P95): The time to first byte at or below which 95% of requests fall.
- 99th Percentile (P99): The time to first byte at or below which 99% of requests fall.

4. Response Time

The total time between sending the request and receiving the full response from the API.
It covers the time the API needs to process the request and deliver the complete response.
Factors Affecting Response Time:
- Backend Processing: Time spent in application logic, database queries, or third-party integrations.
- Payload Size: Larger response payloads require more time to transfer and process.
- Network Conditions: Network congestion, bandwidth, and geographic distance impact overall response time.
- Concurrency Handling: High levels of concurrent requests may lead to resource contention and slower responses.
Metrics Recorded:
- Minimum (Min): The shortest recorded response time.
- Maximum (Max): The longest recorded response time.
- 90th Percentile (P90): The response time at or below which 90% of requests fall.
- 95th Percentile (P95): The response time at or below which 95% of requests fall.
- 99th Percentile (P99): The response time at or below which 99% of requests fall.
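
To make the relationship between TTFB and response time concrete, here is a small sketch that derives both from three timestamps for a single request. The `RequestTiming` class, its field names, and the sample values are hypothetical illustrations, not Aptori's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RequestTiming:
    """Timestamps in seconds for one request (hypothetical field names)."""

    sent_at: float        # when the request was sent
    first_byte_at: float  # when the first response byte arrived
    completed_at: float   # when the last response byte arrived

    @property
    def ttfb(self) -> float:
        """Time to first byte: request sent -> first response byte."""
        return self.first_byte_at - self.sent_at

    @property
    def response_time(self) -> float:
        """Total response time: request sent -> full response received."""
        return self.completed_at - self.sent_at

    @property
    def transfer_time(self) -> float:
        """Time spent receiving the rest of the payload after the first byte."""
        return self.completed_at - self.first_byte_at


# Illustrative values only.
timing = RequestTiming(sent_at=0.000, first_byte_at=0.120, completed_at=0.180)
print(f"TTFB:          {timing.ttfb * 1000:.0f} ms")
print(f"Response time: {timing.response_time * 1000:.0f} ms")
print(f"Transfer time: {timing.transfer_time * 1000:.0f} ms")
```

Under these definitions the response time is never smaller than the TTFB, and the gap between them grows with payload size and slower network conditions.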

Understanding Percentiles

A percentile is a statistical measure used to understand the distribution of a dataset. In the context of API performance metrics, percentiles help analyze response times and Time to First Byte (TTFB) by providing insights into how most requests behave rather than focusing on extreme values.

How Percentiles Work

A percentile indicates the value below which a certain percentage of observations fall. For example, if the 90th percentile (P90) response time is 200ms, it means that 90% of the API requests completed in 200ms or less, while the slowest 10% took longer.
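
The definition above can be sketched with the nearest-rank method. This document does not state which percentile method Aptori uses, so nearest-rank is an assumption here, and `percentile`, `summarize`, and the sample values are hypothetical.

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (assumed method).

    Returns the smallest sample such that at least pct% of all samples
    are less than or equal to it.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in the range (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> dict:
    """Compute the statistics listed under 'Metrics Recorded'."""
    return {
        "min": min(samples),
        "max": max(samples),
        "p90": percentile(samples, 90),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
    }


# Ten illustrative response times in milliseconds, with one slow outlier.
response_times_ms = [120, 135, 140, 150, 155, 160, 170, 180, 200, 950]
summary = summarize(response_times_ms)
print(summary)
```

With these ten samples, P90 is 200 ms because nine of the ten requests finished in 200 ms or less; P95 and P99 both land on the single 950 ms outlier. Small sample sets make the high percentiles sensitive to individual requests, so they are more meaningful when many requests have been recorded.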

Common Percentiles Used in API Metrics

90th Percentile (P90): 90% of requests are completed within this time, representing a general upper bound for good performance.
95th Percentile (P95): 95% of requests are completed within this time, helping to identify near-worst-case response times.
99th Percentile (P99): 99% of requests are completed within this time, highlighting the extreme outliers in performance.

Why Percentiles Matter

Better Than Averages: Averages can be misleading because a few very slow responses can pull the mean far away from what most requests experience. Percentiles provide a clearer picture of real-world user experience.
Focus on Performance Outliers: Higher percentiles (P95, P99) highlight worst-case performance scenarios, crucial for identifying potential API bottlenecks.
Actionable Optimization: If P90 is low but P99 is high, it suggests occasional extreme delays that need investigation.
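
The points above can be illustrated with a small, made-up data set in which 98 requests are fast and 2 are very slow. The sample values, the `nearest_rank` helper, and the factor of 5 used to flag a long tail are assumptions for illustration, not Aptori thresholds.

```python
import math
import statistics
from typing import Sequence


def nearest_rank(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile (assumed method) for a non-empty sample list."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# 98 requests at 100 ms and 2 requests at 5000 ms (illustrative values).
samples_ms = [100] * 98 + [5000] * 2

mean_ms = statistics.mean(samples_ms)
p90_ms = nearest_rank(samples_ms, 90)
p99_ms = nearest_rank(samples_ms, 99)

# Arbitrary rule of thumb for this sketch: flag a long tail when P99 is
# more than 5 times P90.
has_long_tail = p99_ms > 5 * p90_ms

print(f"mean={mean_ms} ms, P90={p90_ms} ms, P99={p99_ms} ms, long tail={has_long_tail}")
```

Here the mean (198 ms) matches neither the typical request (100 ms) nor the slow ones (5000 ms), while the pair of P90 and P99 shows both: most requests are fast, and a small share is delayed enough to warrant investigation.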

Use Cases

Performance Monitoring: Identifying slow API endpoints and optimizing request handling to maintain efficiency.
Scalability Planning: Ensuring API performance remains stable under increased load and predicting infrastructure needs.
User Experience Optimization: Reducing response times to improve overall application responsiveness and user satisfaction.
Incident Response: Quickly detecting and responding to API performance degradation before it affects users.
Capacity Management: Ensuring sufficient resources are allocated to handle expected API traffic.
Compliance and SLAs: Ensuring adherence to service level agreements (SLAs) by monitoring key performance indicators.

Best Practices for API Performance Optimization

Caching Responses: Implement caching strategies to reduce redundant API calls and enhance response times.
Load Balancing: Distribute requests evenly across multiple API instances to prevent server overload.
Asynchronous Processing: Use asynchronous processing for time-consuming operations to improve responsiveness.
Database Query Optimization: Optimize database queries to reduce query execution time.
Compression: Use gzip or Brotli compression to minimize response payload size.
API Gateway Optimization: Configure API gateways efficiently to enhance performance and security.
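
As one concrete illustration of the compression practice, the sketch below gzips a repetitive JSON payload with Python's standard library and compares sizes. The payload is invented for the example, and real savings depend on the actual response content; Brotli is not shown because it requires a third-party package.

```python
import gzip
import json

# Hypothetical API response body: a list of similar JSON records.
records = [{"id": i, "status": "active", "name": f"user-{i}"} for i in range(500)]
payload = json.dumps(records).encode("utf-8")

# Compress as a server or gateway might before sending the response.
compressed = gzip.compress(payload)

ratio = len(compressed) / len(payload)
print(f"raw: {len(payload)} bytes, gzip: {len(compressed)} bytes, ratio: {ratio:.2f}")

# The client recovers the original body by decompressing.
restored = gzip.decompress(compressed)
```

Compression trades CPU time for fewer bytes on the wire, so it tends to help response time most for large, text-heavy payloads and may bring little benefit for very small responses.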

Conclusion

API performance monitoring, through metrics like time to first byte and response times, is key to maintaining high-performing and scalable applications.