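Preface: Even with translation software and ChatGPT available, the language barrier has left me with only a partial understanding of this tool, so I am translating its documentation as a way to learn Perfetto, a powerful tool, together with everyone.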
#####################Translation begins below this divider#####################
The English original is here.
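Perfetto can collect a large number of memory-related events and counters on Android and Linux. These events come from kernel interfaces, both ftrace and /proc, and are of two types: polled counters and events that the kernel pushes into the ftrace buffers.
Per-process polled counters
The process stats data source allows polling /proc/<pid>/status and /proc/<pid>/oom_score_adj at user-defined intervals.
See man 5 proc for their semantics.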
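To make the polled data more concrete, here is a minimal sketch of how the memory fields of a /proc/<pid>/status dump could be parsed. The function name parse_status_kb is hypothetical, and this is not how Perfetto's own data source is implemented; it only illustrates the shape of the text described in man 5 proc.

```python
def parse_status_kb(status_text):
    """Return {field: value_in_kB} for every kB-valued line of a /proc/<pid>/status dump."""
    fields = {}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        parts = rest.split()
        # Memory lines look like "VmRSS:      1234 kB"; everything else is skipped.
        if sep and len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    return fields
```

On a Linux machine, passing the contents of /proc/self/status to this function and reading the "VmRSS" key gives the resident set size of the calling process in kB.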
UI
SQL
select c.ts, c.value, t.name as counter_name, p.name as proc_name, p.pid
from counter as c left join process_counter_track as t on c.track_id = t.id
left join process as p using (upid)
where t.name like 'mem.%'
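To show what the joins in this query do, the sketch below runs it with Python's sqlite3 module against three toy tables. The table layouts are heavily simplified stand-ins for the real trace processor tables, and the process name, PID and counter values are invented for illustration only.

```python
import sqlite3

QUERY = """
select c.ts, c.value, t.name as counter_name, p.name as proc_name, p.pid
from counter as c left join process_counter_track as t on c.track_id = t.id
left join process as p using (upid)
where t.name like 'mem.%'
"""


def run_on_toy_tables():
    """Run the query on hypothetical rows and return the result as a list of tuples."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        create table counter (ts integer, track_id integer, value real);
        create table process_counter_track (id integer, name text, upid integer);
        create table process (upid integer, pid integer, name text);
        insert into process values (1, 4321, 'com.example.app');
        insert into process_counter_track values (10, 'mem.rss', 1);
        insert into process_counter_track values (11, 'oom_score_adj', 1);
        insert into counter values (100, 10, 2048.0);
        insert into counter values (200, 10, 4096.0);
        insert into counter values (200, 11, 900.0);
    """)
    rows = db.execute(QUERY + " order by c.ts").fetchall()
    db.close()
    return rows
```

Each counter sample is matched to its track through track_id, and the track is matched to its process through upid. The like 'mem.%' filter keeps only the memory tracks, so the oom_score_adj sample is dropped.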
TraceConfig
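To collect process stats counters every X ms, set proc_stats_poll_ms = X in the process stats config. X must be greater than 100 ms to avoid excessive CPU usage. For details on the specific counters being collected, see the ProcessStats reference.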
data_sources: {
    config {
        name: "linux.process_stats"
        process_stats_config {
            scan_all_processes_on_start: true
            proc_stats_poll_ms: 1000
        }
    }
}
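Per-process memory events (ftrace)
rss_stat
Recent versions of the Linux kernel can emit an ftrace event whenever the Resident Set Size (RSS) mm counters are updated. This is the same counter exposed in /proc/pid/status as VmRSS. The main advantage of rss_stat is that, being an event-driven push event, it can detect very short bursts of memory usage that polling the /proc counters would miss.
Memory usage peaks of hundreds of MB can have a hugely negative impact on Android even if they last only a few milliseconds, because they can trigger a large number of low memory kills to reclaim memory.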
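The difference between pushed and polled data can be illustrated with a small simulation. The timeline below is invented: a process sits around 100 MB and spikes to 700 MB for 5 ms. The function names are hypothetical, and the sketch assumes the events are sorted by timestamp.

```python
def peak_from_events(events):
    """Peak value when every update is recorded (push model, like rss_stat)."""
    return max(value for _, value in events)


def peak_from_polling(events, period_ms, end_ms):
    """Peak value seen when sampling the latest value every period_ms (poll model)."""
    peak = 0
    for now in range(0, end_ms + 1, period_ms):
        # The poller only sees the most recent update at or before `now`.
        seen = [value for ts, value in events if ts <= now]
        if seen:
            peak = max(peak, seen[-1])
    return peak


# Hypothetical timeline of (timestamp_ms, rss_mb) updates, sorted by timestamp.
EVENTS = [(0, 100), (1203, 700), (1208, 110), (2500, 120)]
```

With a 1000 ms polling period the samples land at 0, 1000, 2000 and 3000 ms, so the 700 MB spike between 1203 ms and 1208 ms is never observed, whereas the event stream records it.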
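This feature was introduced in the Linux kernel by commit b3d1411b6 and later improved by e4dcad20. Both are available upstream from Linux v5.5-rc1 onward. The patch has been backported to several Google Pixel kernels running Android 10 (Q).
mm_event
mm_event is an ftrace event that captures statistics about key memory events (a subset of those exposed by /proc/vmstat). Unlike RSS-stat counter updates, mm events are extremely high volume, and tracing them individually would be infeasible. mm_event instead reports only periodic histograms in the trace, which reduces the overhead significantly.
mm_event is available only on some Google Pixel kernels running Android 10 (Q) and later.
When mm_event is enabled, the following mm event types are recorded: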
mem.mm.min_flt: Minor page faults
mem.mm.maj_flt: Major page faults
mem.mm.swp_flt: Page faults served by swapcache
mem.mm.read_io: Read page faults backed by I/O
mem.mm.compaction: Memory compaction events
mem.mm.reclaim: Memory reclaim events
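For each event type, the event records:
count: the number of times the event happened since the previous event.
min_lat: the smallest latency (the duration of the mm event) recorded since the previous event.
max_lat: the highest latency recorded since the previous event.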
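The idea of reporting a periodic summary instead of every single event can be sketched as follows. This is only an illustration of the aggregation described above, not the kernel's implementation; the function name summarize_window and the input shape (a list of (event_type, latency) pairs observed in one reporting window) are assumptions.

```python
from collections import defaultdict


def summarize_window(events):
    """Collapse the (event_type, latency) pairs of one window into one record per type."""
    latencies_by_type = defaultdict(list)
    for event_type, latency in events:
        latencies_by_type[event_type].append(latency)
    return {
        event_type: {
            "count": len(latencies),
            "min_lat": min(latencies),
            "max_lat": max(latencies),
        }
        for event_type, latencies in latencies_by_type.items()
    }
```

However many page faults occur in a window, the trace then carries at most one small record per event type, which is where the overhead reduction comes from.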
UI
SQL
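At the SQL level, these events are imported and exposed in the same way as the corresponding polled events. This makes it possible to collect both types of events (pushed and polled) and treat them uniformly in queries and scripts.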
select c.ts, c.value, t.name as counter_name, p.name as proc_name, p.pid
from counter as c left join process_counter_track as t on c.track_id = t.id
left join process as p using (upid)
where t.name like 'mem.%'
TraceConfig
To enable tracing of rss_stat and mm_event, add the following options to the trace config:
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            ftrace_events: "kmem/rss_stat"
            ftrace_events: "mm_event/mm_event_record"
        }
    }
}
# This is for getting Thread<>Process associations and full process names.
data_sources: {
    config {
        name: "linux.process_stats"
    }
}
System-wide polled counters
This data source allows periodic polling of system data from:
/proc/stat
/proc/vmstat
/proc/meminfo
See man 5 proc for their semantics.
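As an illustration of what periodic polling of these files yields, the sketch below parses two snapshots of /proc/vmstat text and computes the change between them. The function names are hypothetical and the numbers in the usage note are invented. Some vmstat fields (such as pgfault) are cumulative counts, so the difference between two polls is the useful quantity, while others (such as nr_free_pages) are instantaneous values.

```python
def parse_vmstat(text):
    """Return {name: value} from /proc/vmstat text, which has one 'name value' pair per line."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def delta(previous, current):
    """Return the change of every counter present in both snapshots."""
    return {name: current[name] - previous[name] for name in current if name in previous}
```

For example, two polls reporting pgfault 500 and then pgfault 650 mean that 150 page faults occurred during that polling period.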
UI
The polling period and the specific counters to include in the trace can be set in the trace config.
SQL
select c.ts, t.name, c.value / 1024 as value_kb from counter as c left join counter_track as t on c.track_id = t.id
TraceConfig
See the TraceConfig reference for the supported counters.
data_sources: {
    config {
        name: "linux.sys_stats"
        sys_stats_config {
            meminfo_period_ms: 1000
            meminfo_counters: MEMINFO_MEM_TOTAL
            meminfo_counters: MEMINFO_MEM_FREE
            meminfo_counters: MEMINFO_MEM_AVAILABLE
            vmstat_period_ms: 1000
            vmstat_counters: VMSTAT_NR_FREE_PAGES
            vmstat_counters: VMSTAT_NR_ALLOC_BATCH
            vmstat_counters: VMSTAT_NR_INACTIVE_ANON
            vmstat_counters: VMSTAT_NR_ACTIVE_ANON
            stat_period_ms: 1000
            stat_counters: STAT_CPU_TIMES
            stat_counters: STAT_FORK_COUNT
        }
    }
}
LMK
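Background
When memory is tight, the Android framework kills apps and services, especially background ones, to make room for newly opened apps that need memory. These kills are known as low memory kills (LMKs).
Note that an LMK does not by itself indicate a performance problem. As a rule of thumb, the severity (that is, the user-perceived impact) is proportional to the state of the app being killed. The app state can be derived in a trace from the OOM adjustment score.
An LMK of a foreground app or service is typically a big concern: it means that the app the user was interacting with suddenly disappeared, or the music they were listening to suddenly stopped.
An LMK of a cached app or service, by contrast, is business as usual. In most cases the user will not notice until they return to the app, at which point it cold-starts.
The cases between these two extremes are more nuanced. LMKs of cached apps or services can still be a problem when they happen in storms (that is, when most processes are seen being killed within a short time frame), which is often a symptom of some system component causing a memory spike.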
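One way to look for such storms is to scan the kill timestamps for clusters. The sketch below is a simple heuristic, not something Perfetto provides: the function name find_kill_bursts and the default thresholds (at least 3 kills within 1000 ms) are assumptions chosen purely for illustration and would need tuning on real traces.

```python
def find_kill_bursts(kill_ts_ms, window_ms=1000, min_kills=3):
    """Return (first_ts, last_ts, count) for each group of kills that fits in window_ms."""
    bursts = []
    ts = sorted(kill_ts_ms)
    i = 0
    while i < len(ts):
        # Extend the group while the next kill is still within window_ms of the first one.
        j = i
        while j + 1 < len(ts) and ts[j + 1] - ts[i] <= window_ms:
            j += 1
        count = j - i + 1
        if count >= min_kills:
            bursts.append((ts[i], ts[j], count))
            i = j + 1  # continue after the reported burst
        else:
            i += 1
    return bursts
```

Isolated kills are ignored, while a run of kills packed into a short interval is reported as a single burst that can then be lined up with other tracks in the trace.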
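lowmemorykiller vs lmkd
In-kernel lowmemorykiller driver
On Android, LMKs used to be handled by an ad hoc kernel driver, Linux's drivers/staging/android/lowmemorykiller.c. This driver emits the ftrace event lowmemorykiller/lowmemory_kill in the trace.
Userspace lmkd
Android 9 introduced a userspace native daemon, lmkd, that took over the LMK responsibility. Not every device running Android 9 necessarily uses lmkd, because the final choice between in-kernel and userspace depends on the kernel version and kernel configuration chosen by the phone manufacturer.
On Google Pixel phones, lmkd has been used since the Pixel 2 running Android 9.
See https://source.android.com/devices/tech/perf/lmkd for details.
lmkd emits a userspace atrace counter event called kill_one_process.
Android LMK vs Linux oomkiller
LMKs on Android, whether from the old in-kernel lowmemorykiller or the newer lmkd, use a completely different mechanism from the standard Linux kernel OOM Killer. Perfetto currently supports only Android LMK events (both in-kernel and userspace) and does not support tracing Linux kernel OOM Killer events. Linux OOM Killer events are still theoretically possible on Android but extremely unlikely; if they do happen, they are more likely a symptom of a misconfigured BSP.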
UI
Newer userspace LMKs appear in the UI under the lmkd track as a counter. The counter value is the PID of the killed process (PID=27985 in the example).
TODO: we are working on a better UI support for LMKs.
SQL
Both the newer lmkd and the legacy in-kernel LMK events are normalized at import time and are available under the mem.lmk key in the instant table.
SELECT ts, process.name, process.pid
FROM instant
JOIN process_track ON instant.track_id = process_track.id
JOIN process USING (upid)
WHERE instant.name = 'mem.lmk'
TraceConfig
To enable tracing of low memory kills, add the following options to the trace config:
data_sources: {
    config {
        name: "linux.ftrace"
        ftrace_config {
            # For old in-kernel events.
            ftrace_events: "lowmemorykiller/lowmemory_kill"
            # For new userspace lmkds.
            atrace_apps: "lmkd"
            # This is not strictly required but is useful to know the state
            # of the process (FG, cached, ...) before it got killed.
            ftrace_events: "oom/oom_score_adj_update"
        }
    }
}
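App states and OOM score adjustment
The Android app state can be inferred in a trace from the process oom_score_adj. The mapping is not 1:1: there are more states than oom_score_adj values, and the oom_score_adj range for cached processes spans from 900 to 1000.
The mapping can be found in the ActivityManager's ProcessList sources: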
// This is a process only hosting activities that are not visible,
// so it can be killed without any disruption.
static final int CACHED_APP_MAX_ADJ = 999;
static final int CACHED_APP_MIN_ADJ = 900;
// This is the oom_adj level that we allow to die first. This cannot be equal to
// CACHED_APP_MAX_ADJ unless processes are actively being assigned an oom_score_adj of
// CACHED_APP_MAX_ADJ.
static final int CACHED_APP_LMK_FIRST_ADJ = 950;
// The B list of SERVICE_ADJ -- these are the old and decrepit
// services that aren't as shiny and interesting as the ones in the A list.
static final int SERVICE_B_ADJ = 800;
// This is the process of the previous application that the user was in.
// This process is kept above other things, because it is very common to
// switch back to the previous app. This is important both for recent
// task switch (toggling between the two top recent apps) as well as normal
// UI flow such as clicking on a URI in the e-mail app to view in the browser,
// and then pressing back to return to e-mail.
static final int PREVIOUS_APP_ADJ = 700;
// This is a process holding the home application -- we want to try
// avoiding killing it, even if it would normally be in the background,
// because the user interacts with it so much.
static final int HOME_APP_ADJ = 600;
// This is a process holding an application service -- killing it will not
// have much of an impact as far as the user is concerned.
static final int SERVICE_ADJ = 500;
// This is a process with a heavy-weight application. It is in the
// background, but we want to try to avoid killing it. Value set in
// system/rootdir/init.rc on startup.
static final int HEAVY_WEIGHT_APP_ADJ = 400;
// This is a process currently hosting a backup operation. Killing it
// is not entirely fatal but is generally a bad idea.
static final int BACKUP_APP_ADJ = 300;
// This is a process bound by the system (or other app) that's more important than services but
// not so perceptible that it affects the user immediately if killed.
static final int PERCEPTIBLE_LOW_APP_ADJ = 250;
// This is a process only hosting components that are perceptible to the
// user, and we really want to avoid killing them, but they are not
// immediately visible. An example is background music playback.
static final int PERCEPTIBLE_APP_ADJ = 200;
// This is a process only hosting activities that are visible to the
// user, so we'd prefer they don't disappear.
static final int VISIBLE_APP_ADJ = 100;
// This is a process that was recently TOP and moved to FGS. Continue to treat it almost
// like a foreground app for a while.
// @see TOP_TO_FGS_GRACE_PERIOD
static final int PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ = 50;
// This is the process running the current foreground app. We'd really
// rather not kill it!
static final int FOREGROUND_APP_ADJ = 0;
// This is a process that the system or a persistent process has bound to,
// and indicated it is important.
static final int PERSISTENT_SERVICE_ADJ = -700;
// This is a system persistent process, such as telephony. Definitely
// don't want to kill it, but doing so is not completely fatal.
static final int PERSISTENT_PROC_ADJ = -800;
// The system process runs at the default adjustment.
static final int SYSTEM_ADJ = -900;
// Special code for native processes that are not being managed by the system (so
// don't have an oom adj assigned by the system).
static final int NATIVE_ADJ = -1000;
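When reading oom_score_adj values out of a trace, it can help to turn them back into these names. The sketch below copies the constants listed above into a lookup table. Treating each constant as the lower bound of a bucket (so that, for example, 905 is reported as cached) is an assumption made for illustration, and describe_oom_score_adj is a hypothetical helper, not part of Android or Perfetto.

```python
# (lower bound, label) pairs taken from the ProcessList constants, highest first.
_ADJ_LEVELS = [
    (900, "CACHED_APP"),
    (800, "SERVICE_B"),
    (700, "PREVIOUS_APP"),
    (600, "HOME_APP"),
    (500, "SERVICE"),
    (400, "HEAVY_WEIGHT_APP"),
    (300, "BACKUP_APP"),
    (250, "PERCEPTIBLE_LOW_APP"),
    (200, "PERCEPTIBLE_APP"),
    (100, "VISIBLE_APP"),
    (50, "PERCEPTIBLE_RECENT_FOREGROUND_APP"),
    (0, "FOREGROUND_APP"),
    (-700, "PERSISTENT_SERVICE"),
    (-800, "PERSISTENT_PROC"),
    (-900, "SYSTEM"),
    (-1000, "NATIVE"),
]


def describe_oom_score_adj(adj):
    """Return the label of the highest level whose lower bound is <= adj."""
    for lower_bound, label in _ADJ_LEVELS:
        if adj >= lower_bound:
            return label
    return "UNKNOWN"  # below NATIVE_ADJ, which should not occur in practice
```

Applied to the oom_score_adj of a process just before it was killed, this gives a quick indication of whether the kill hit a cached app (usually harmless) or something closer to the foreground.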
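#####################Translation ends above this divider#####################
Afterword:
1 This translation relied mainly on Baidu Translate. It gets criticized, but at least a translation tool lowers the barrier to entry.
2 The long, difficult sentences in the English documentation really are long and difficult. I started from the Baidu translation and adjusted it myself; my skill is limited.
3 I lack some of the technical background, so I did not know how to translate certain terms, nor whether Baidu translated them accurately. The real work lies outside the text itself.
4 Every beginning is hard; whether the middle is hard, I do not know yet. I will deal with the middle later, and aim to translate one article a day.
5 Some may look down on this, but someone has to do the small, unglamorous jobs.
6 Google is impressive, and so is Perfetto. As the saying goes, the wise make good use of tools.
7 Using the tool is the easiest way in; there is much more behind it worth learning.
8 My skill is limited and I welcome corrections. I hope more people will give feedback, and I look forward to better suggestions.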