Skip to main content
Calico Cloud documentation

Flow log data types

Big picture

Calico Cloud sends the following data to Elasticsearch.

The following table details the key/value pairs in the JSON blob, including their Elasticsearch datatype.

NameDatatypeDescription
hostkeywordName of the node that collected the flow log entry.
start_timedateStart time of log collection in UNIX timestamp format.
end_timedateEnd time of log collection in UNIX timestamp format.
actionkeyword- allow: Calico Cloud accepted the flow.
- deny: Calico Cloud denied the flow.
bytes_inlongNumber of incoming bytes since the last export.
bytes_outlongNumber of outgoing bytes since the last export.
dest_ipipIP address of the destination pod. A null value indicates aggregation.
dest_namekeywordContains one of the following values:
- Name of the destination pod.
- Name of the pod that was aggregated or the endpoint is not a pod. Check dest_name_aggr for more information, such as the name of the pod if it was aggregated.
dest_name_aggrkeywordContains one of the following values:
- Aggregated name of the destination pod.
- pvt: endpoint is not a pod. Its IP address belongs to a private subnet.
- pub: endpoint is not a pod. Its IP address does not belong to a private subnet. It is probably an endpoint on the public internet.
dest_namespacekeywordNamespace of the destination endpoint. A - means the endpoint is not namespaced.
dest_portlongDestination port. Not applicable for ICMP packets.
dest_service_namekeywordName of the destination service. A - means the original destination did not correspond to a known Kubernetes service (e.g. a services ClusterIP).
dest_service_namespacekeywordNamespace of the destination service. A - means the original destination did not correspond to a known Kubernetes service (e.g. a services ClusterIP).
dest_service_portkeywordPort name of the destination service.
A - means :
- the original destination did not correspond to a known Kubernetes service (e.g. a services ClusterIP), or
- the destination port is aggregated.
A * means there are multiple service port names matching the destination port number.
dest_typekeywordDestination endpoint type. Possible values:
- wep: A workload endpoint, a pod in Kubernetes.
- ns: A Networkset. If multiple Networksets match, then the one with the longest prefix match is chosen.
- net: A Network. The IP address did not fall into a known endpoint type.
dest_labelsarray of keywordsLabels applied to the destination pod. A hyphen indicates aggregation.
dest_domainsarray of keywordsFind all the destination domain names for use in a DNS policy by examining dest_domains. The field displays information on the top-level domains linked to the destination IP. Applies to flows reported from the source to destinations outside the cluster. If flowLogsDestDomainsByClient is disabled, having dest_domains: ["A"] doesn't guarantee that the flow corresponds to a connection with domain name A. The destination IP may also be linked to other domain names not yet captured by Calico.
reporterkeyword- src: flow came from the pod that initiated the connection.
- dst: flow came from the pod that received the initial connection.
num_flowslongNumber of flows aggregated into this entry during this export interval.
num_flows_completedlongNumber of flows that were completed during the export interval.
num_flows_startedlongNumber of flows that were started during the export interval.
num_process_nameslongNumber of unique process names aggregated into this entry during this export interval.
num_process_idslongNumber of unique process ids aggregated into this entry during this export interval.
num_process_argslongNumber of unique process args aggregated into this entry during this export interval.
nat_outgoing_portsarray of intsList of NAT outgoing ports for the packets that were Source NAT'd in the flow
packets_inlongNumber of incoming packets since the last export.
packets_outlongNumber of outgoing packets since the last export.
protokeywordProtocol.
policiesarray of keywordsList of policies that interacted with this flow. See Format of the policies field.
process_namekeywordThe name of the process that initiated or received the connection or connection request. This field will have the executable path if flowLogsCollectProcessPath is enabled. A "-" indicates that the process name is not logged. A "*" indicates that the per flow process limit has exceeded and the process names are now aggregated.
process_idkeywordThe process ID of the corresponding process (indicated by the process_name field) that initiated or received the connection or connection request. A "-" indicates that the process ID is not logged. A "*" indicates that there are more than one unique process IDs for the corresponding process name.
process_argsarray of stringsThe arguments with which the executable was invoked. The size of the list depends on the per flow process args limit.
source_ipipIP address of the source pod. A null value indicates aggregation.
source_namekeywordContains one of the following values:
- Name of the source pod.
- Name of the pod that was aggregated or the endpoint is not a pod. Check source_name_aggr for more information, such as the name of the pod if it was aggregated.
source_name_aggrkeywordContains one of the following values:
- Aggregated name of the source pod.
- pvt: Endpoint is not a pod. Its IP address belongs to a private subnet.
- pub: the endpoint is not a pod. Its IP address does not belong to a private subnet. It is probably an endpoint on the public internet.
source_namespacekeywordNamespace of the source endpoint. A - means the endpoint is not namespaced.
source_portlongSource port. A null value indicates aggregation.
source_typekeywordThe type of source endpoint. Possible values:
- wep: A workload endpoint, a pod in Kubernetes.
- ns: A Networkset. If multiple Networksets match, then the one with the longest prefix match is chosen.
- net: A Network. The IP address did not fall into a known endpoint type.
source_labelsarray of keywordsLabels applied to the source pod. A hyphen indicates aggregation.
original_source_ipsarray of ipsList of external IP addresses collected from requests made to the cluster through an ingress resource. This field is only available if capturing external IP addresses is configured.
num_original_source_ipslongNumber of unique external IP addresses collected from requests made to the cluster through an ingress resource. This count includes the IP addresses included in the original_source_ips field. This field is only available if capturing external IP addresses is configured.
tcp_mean_send_congestion_windowlongMean tcp send congestion window size. This field is only available if flowLogsEnableTcpStats is enabled
tcp_min_send_congestion_windowlongMinimum tcp send congestion window size. This field is only available if flowLogsEnableTcpStats is enabled
tcp_mean_smooth_rttlongMean smooth RTT in micro seconds. This field is only available if flowLogsEnableTcpStats is enabled
tcp_max_smooth_rttlongMaximum smooth RTT in micro seconds. This field is only available if flowLogsEnableTcpStats is enabled
tcp_mean_min_rttlongMean MinRTT in micro seconds. This field is only available if flowLogsEnableTcpStats is enabled
tcp_max_min_rttlongMaximum MinRTT in micro seconds. This field is only available if flowLogsEnableTcpStats is enabled
tcp_mean_msslongMean TCP MSS. This field is only available if flowLogsEnableTcpStats is enabled
tcp_min_msslongMinimum TCP MSS. This field is only available if flowLogsEnableTcpStats is enabled
tcp_total_retransmissionslongTotal retransmitted packets. This field is only available if flowLogsEnableTcpStats is enabled
tcp_lost_packetslongTotal lost packets. This field is only available if flowLogsEnableTcpStats is enabled
tcp_unrecovered_tolongTotal unrecovered timeouts. This field is only available if flowLogsEnableTcpStats is enabled

Format of the policies field

The policies field contains a comma-delimited list of policy rules that matched the flow. Each entry in the list has the following format:

<index>|<tier name>|<policy name>|<action>|<rule index>

Where,

  • <index> numbers the order in which the rules were hit, starting with 0.

    tip

    Sort the entries of the list by the <index> to see the order that rules were hit. The entries are displayed in random order due to the way they are stored in the datastore.

  • <tier name> is the name of the policy tier containing the policy, or __PROFILE__ for a rule derived from a Profile resource (this is the internal datatype used to represent a Kubernetes namespace and its associated "default allow" rule).

  • <policy name> is the name of the policy/profile; its format depends on the type of policy:

    • <tier>.<name> for Calico Cloud GlobalNetworkPolicy.
    • <namespace>/knp.default.<name> for Kubernetes NetworkPolicy.
    • <namespace>/<tier>.<name> for Calico Cloud NetworkPolicy.

    Staged policy names are prefixed with "staged:".

  • <action> is the action performed by the rule; one of allow, deny, pass.

  • <rule index> if non-negative, is the index of the rule that was matched within the policy, starting with 0. Otherwise, a special value:

    • -1 means the reporting endpoint was selected by the policy but no rule matched. The traffic hit the default action for the tier. In this case, the <policy name> is selected arbitrarily from the set of policies within the tier that apply to the endpoint.
    • -2 means "unknown". The rule index was not recorded.

Flow log example, with no aggregation

A flow log with aggregation level 0, no aggregation, might look like:

  {
"start_time": 1597166083,
"end_time": 1597166383,
"source_ip": "192.168.47.9",
"source_name": "access-6b687c8dcb-zn5s2",
"source_name_aggr": "access-6b687c8dcb-*",
"source_namespace": "policy-demo",
"source_port": 42106,
"source_type": "wep",
"source_labels": {
"labels": [
"pod-template-hash=6b687c8dcb",
"app=access"
]
},
"dest_ip": "192.168.138.79",
"dest_name": "nginx-86c57db685-h6792",
"dest_name_aggr": "nginx-86c57db685-*",
"dest_namespace": "policy-demo",
"dest_port": 80,
"dest_type": "wep",
"dest_labels": {
"labels": [
"pod-template-hash=86c57db685",
"app=nginx"
]
},
"proto": "tcp",
"action": "allow",
"reporter": "dst",
"policies": {
"all_policies": [
"0|default|policy-demo/default.access-nginx|allow"
]
},
"bytes_in": 388,
"bytes_out": 1113,
"num_flows": 1,
"num_flows_started": 1,
"num_flows_completed": 1,
"packets_in": 6,
"packets_out": 5,
"http_requests_allowed_in": 0,
"http_requests_denied_in": 0,
"original_source_ips": null,
"num_original_source_ips": 0,
"host": "bz-n8kf-kadm-node-1",
"@timestamp": 1597166383000
}

The log shows an incoming connection reported by the destination node, allowed by a policy on port 80. The start_time and end_time describe the aggregation period (5 min.) During this interval, one flow ("num_flow": 1) was recorded. At higher aggregation levels, flows from endpoints performing the same operation and originating from the same Deployment/ReplicaSet are grouped into a single log. In this example, the common source endpoints that are prefixed with access-6b687c8dcb-. Parameters like source_ip may be dropped and set to null, depending on the aggregation level. As aggregation levels increase, more flows will be grouped together based on your data. For more details on aggregation levels, see configure flow log aggregation.