2007 Commits

Author SHA1 Message Date
Peter Zijlstra
b9986c3837 sched/fair: Propagate an effective runnable_load_avg
The load balancer uses runnable_load_avg as load indicator. For
!cgroup this is:

  runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq

That is, a direct sum of all runnable tasks on that runqueue. As
opposed to load_avg, which is a sum of all tasks on the runqueue,
which includes a blocked component.

However, in the cgroup case, this comes apart since the group entities
are always runnable, even if most of their constituent entities are
blocked.

Therefore introduce a runnable_weight which for task entities is the
same as the regular weight, but for group entities is a fraction of
the entity weight and represents the runnable part of the group
runqueue.

Then propagate this load through the PELT hierarchy to arrive at an
effective runnable load average -- which we should not confuse with
the canonical runnable load average.
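
As a rough illustration of the idea above, here is a minimal Python model, not kernel code. The `Task` class, the helper names, and the exact scaling formula (weight times runnable fraction) are assumptions made for this sketch; the commit text only says that a group entity's runnable_weight is "a fraction of the entity weight".

```python
from dataclasses import dataclass


@dataclass
class Task:
    load_avg: int   # load contribution of the task
    on_rq: bool     # True if runnable, False if blocked


def runnable_load_avg(tasks):
    """Sum load_avg over runnable tasks only (the !cgroup formula)."""
    return sum(t.load_avg for t in tasks if t.on_rq)


def load_avg(tasks):
    """Sum load_avg over all tasks, including the blocked component."""
    return sum(t.load_avg for t in tasks)


def group_runnable_weight(weight, tasks):
    """Hypothetical: scale a group entity's weight by the runnable
    fraction of its group runqueue."""
    total = load_avg(tasks)
    if total == 0:
        return 0.0
    return weight * runnable_load_avg(tasks) / total


# One runnable task and two blocked tasks inside a group.
group = [Task(300, True), Task(300, False), Task(400, False)]
print(runnable_load_avg(group), load_avg(group))
print(group_runnable_weight(1024, group))
```

In this model the group entity is always "runnable", yet only 30% of its weight is counted as runnable load, which is the distinction the commit draws.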

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Helium-Studio <67852324+Helium-Studio@users.noreply.github.com>
2024-08-12 21:00:31 +03:00
kondors1995
e9b310c8d7 sched:reland to f5fd7051d6 2024-08-12 20:59:24 +03:00
kondors1995
ebe2649ab6 Merge remote-tracking branch 'openela/linux-4.14.y' into 14.0-matrix 2024-08-11 18:42:43 +03:00
Steven Rostedt
8b76f73f51 ASoC: tracing: Export SND_SOC_DAPM_DIR_OUT to its value
[ Upstream commit 58300f8d6a48e58d1843199be743f819e2791ea3 ]

The string SND_SOC_DAPM_DIR_OUT is printed in the snd_soc_dapm_path trace
event instead of its value:

   (((REC->path_dir) == SND_SOC_DAPM_DIR_OUT) ? "->" : "<-")

User space cannot parse this, as it has no idea what SND_SOC_DAPM_DIR_OUT
is. Use TRACE_DEFINE_ENUM() to convert it to its value:

   (((REC->path_dir) == 1) ? "->" : "<-")

So that user space tools, such as perf and trace-cmd, can parse it
correctly.
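
The effect on the print format can be mimicked with a small Python sketch. The `resolve_enums` helper is hypothetical and only imitates the symbol-to-value substitution described above; it is not how the kernel or any tracing tool implements it.

```python
import re


def resolve_enums(print_fmt, enum_values):
    """Replace each known enum symbol in a print format with its number.

    Identifiers that are not in enum_values are left untouched.
    """
    def substitute(match):
        name = match.group(0)
        return str(enum_values.get(name, name))

    return re.sub(r"[A-Za-z_][A-Za-z0-9_]*", substitute, print_fmt)


before = '(((REC->path_dir) == SND_SOC_DAPM_DIR_OUT) ? "->" : "<-")'
after = resolve_enums(before, {"SND_SOC_DAPM_DIR_OUT": 1})
print(after)
```

A parser that knows only the record fields can evaluate the second form, because no kernel-internal symbol is left in it.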

Reported-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Fixes: 6e588a0d83 ("ASoC: dapm: Consolidate path trace events")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240416000303.04670cdf@rorschach.local.home
Signed-off-by: Mark Brown <broonie@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(cherry picked from commit 60c68092723ea420215e9c3d5530038bc6568739)
Signed-off-by: Vegard Nossum <vegard.nossum@oracle.com>
2024-07-08 08:15:06 +00:00
kondors1995
9701f583c7 Merge branch '14.0' into 14.0-matrix 2024-06-09 12:39:33 +03:00
kondors1995
a29626690a Revert "trace: sched: add capacity change tracing"
This reverts commit 92ed42c092.
2024-06-09 12:30:47 +03:00
Samuel Pascua
99dca81544 thermal: cpu_cooling: fix backport
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
2024-04-25 10:39:31 +03:00
Thara Gopinath
360d1cba49 sched/pelt: Add support to track thermal pressure
Extrapolating on the existing framework to track rt/dl utilization using
pelt signals, add a similar mechanism to track thermal pressure. The
difference here from rt/dl utilization tracking is that, instead of
tracking time spent by a CPU running a RT/DL task through util_avg, the
average thermal pressure is tracked through load_avg. This is because
thermal pressure signal is weighted time "delta" capacity unlike util_avg
which is binary. "delta capacity" here means delta between the actual
capacity of a CPU and the decreased capacity of a CPU due to a thermal event.

In order to track average thermal pressure, a new sched_avg variable
avg_thermal is introduced. Function update_thermal_load_avg can be called
to do the periodic bookkeeping (accumulate, decay and average) of the
thermal pressure.
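
The "accumulate, decay and average" bookkeeping can be pictured with a simplified Python model. It is a sketch, not the kernel's fixed-point implementation: `update_thermal_avg` and `run` are hypothetical names, and the only PELT property assumed is the usual geometric decay in which a contribution halves after 32 periods.

```python
PELT_HALF_LIFE_PERIODS = 32
# Per-period decay factor y, chosen so that y ** 32 == 0.5.
DECAY = 0.5 ** (1.0 / PELT_HALF_LIFE_PERIODS)


def update_thermal_avg(avg, pressure):
    """Decay the old average by one period and blend in the current
    thermal pressure (the "delta capacity" for that period)."""
    return avg * DECAY + pressure * (1.0 - DECAY)


def run(avg, pressure, periods):
    """Apply the update for a number of consecutive periods."""
    for _ in range(periods):
        avg = update_thermal_avg(avg, pressure)
    return avg


# Constant pressure of 200 capacity units: the average converges to 200.
saturated = run(0.0, 200.0, 2000)
# Pressure removed: the average halves after 32 periods.
halved = run(saturated, 0.0, 32)
print(round(saturated, 3), round(halved, 3))
```

Because the input is a capacity delta rather than a running/not-running flag, the average carries a magnitude, which is why the commit tracks it through load_avg instead of util_avg.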

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-2-thara.gopinath@linaro.org
2024-04-25 10:39:29 +03:00
Peter Zijlstra
3f5082bbb6 sched/fair: Propagate an effective runnable_load_avg
The load balancer uses runnable_load_avg as load indicator. For
!cgroup this is:

  runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq

That is, a direct sum of all runnable tasks on that runqueue. As
opposed to load_avg, which is a sum of all tasks on the runqueue,
which includes a blocked component.

However, in the cgroup case, this comes apart since the group entities
are always runnable, even if most of their constituent entities are
blocked.

Therefore introduce a runnable_weight which for task entities is the
same as the regular weight, but for group entities is a fraction of
the entity weight and represents the runnable part of the group
runqueue.

Then propagate this load through the PELT hierarchy to arrive at an
effective runnable load average -- which we should not confuse with
the canonical runnable load average.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-25 10:39:28 +03:00
Samuel Pascua
5cd4cef17d thermal: cpu_cooling: fix backport
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
2024-04-23 11:35:30 +03:00
Thara Gopinath
a7554f783b sched/pelt: Add support to track thermal pressure
Extrapolating on the existing framework to track rt/dl utilization using
pelt signals, add a similar mechanism to track thermal pressure. The
difference here from rt/dl utilization tracking is that, instead of
tracking time spent by a CPU running a RT/DL task through util_avg, the
average thermal pressure is tracked through load_avg. This is because
thermal pressure signal is weighted time "delta" capacity unlike util_avg
which is binary. "delta capacity" here means delta between the actual
capacity of a CPU and the decreased capacity of a CPU due to a thermal event.

In order to track average thermal pressure, a new sched_avg variable
avg_thermal is introduced. Function update_thermal_load_avg can be called
to do the periodic bookkeeping (accumulate, decay and average) of the
thermal pressure.

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-2-thara.gopinath@linaro.org
2024-04-23 11:35:29 +03:00
Peter Zijlstra
43bec9d078 sched/fair: Propagate an effective runnable_load_avg
The load balancer uses runnable_load_avg as load indicator. For
!cgroup this is:

  runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq

That is, a direct sum of all runnable tasks on that runqueue. As
opposed to load_avg, which is a sum of all tasks on the runqueue,
which includes a blocked component.

However, in the cgroup case, this comes apart since the group entities
are always runnable, even if most of their constituent entities are
blocked.

Therefore introduce a runnable_weight which for task entities is the
same as the regular weight, but for group entities is a fraction of
the entity weight and represents the runnable part of the group
runqueue.

Then propagate this load through the PELT hierarchy to arrive at an
effective runnable load average -- which we should not confuse with
the canonical runnable load average.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-23 11:35:27 +03:00
kondors1995
af30df53ee treewide: Revert Pixel 8 cass backports
Something caused issues with media playback

Squashed commit of the following:

commit 9b42dfb148a86d279550e50e7374c799a03a5a6f
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:41 2024 +0300

    Revert "sched/fair: Clean up calc_cfs_shares()"

    This reverts commit 09f624912f.

commit 8acb165e9e42219646fd3e6d6c3f51a6f702a686
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:32 2024 +0300

    Revert "sched/fair: Add comment to calc_cfs_shares()"

    This reverts commit f9d3ffd696.

commit 423c42eb6eb17ad7f3f33cdab7fdf1ff0f98dc79
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:32 2024 +0300

    Revert "sched/fair: Cure calc_cfs_shares() vs. reweight_entity()"

    This reverts commit 82b9ddd10d.

commit 1093143c06279c2d89b3241ac3e5abf339f7abcd
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:31 2024 +0300

    Revert "sched/fair: Remove se->load.weight from se->avg.load_sum"

    This reverts commit 8afedeb608.

commit 4504e6b8c2119e092e831879cbd9c6adaea2713a
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:31 2024 +0300

    Revert "sched: remove rt_rq utilization tracking for now"

    This reverts commit ca1bdfe9c2.

commit db5377a1b2868ffc6d3f33225353c33398e1de7b
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:30 2024 +0300

    Revert "sched/fair: Change update_load_avg() arguments"

    This reverts commit 561a960285.

commit d099da1d8854cbbdb8fca1c4eaf4f692f940a6c7
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:30 2024 +0300

    Revert "sched/fair: Move enqueue migrate handling"

    This reverts commit 4fe3a618af.

commit 9b8f11264e1bd21ecfd69b6c77e88d9893e20a98
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:29 2024 +0300

    Revert "sched/fair: Rename {en,de}queue_entity_load_avg()"

    This reverts commit 86597c5bf2.

commit 382d23f36c9c5430b0f96cbf379ded416bc1f330
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:29 2024 +0300

    Revert "sched/fair: Introduce {en,de}queue_load_avg()"

    This reverts commit 04ee68d87c.

commit 7d7c60a4d4ceb493f40703bf596ac949df31dd33
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:28 2024 +0300

    Revert "sched/fair: More accurate reweight_entity()"

    This reverts commit ec54bd1f93.

commit afcd60d3ee87de1d4f069acc713e2097d9b40be8
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:28 2024 +0300

    Revert "sched/fair: Use reweight_entity() for set_user_nice()"

    This reverts commit 0d5a173dfd.

commit af78e8c698df9201648634d7dd1e4804494a765d
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:27 2024 +0300

    Revert "sched/fair: Rewrite cfs_rq->removed_*avg"

    This reverts commit 7b0d11dc5a.

commit 2a44c05e1f96b14f80ea578b386af04eb24e5aac
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:27 2024 +0300

    Revert "sched/fair: Rewrite PELT migration propagation"

    This reverts commit dc3a4b3e44.

commit a876ca5f4b2937b8adc90b3f502376d45a891d54
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:26 2024 +0300

    Revert "sched/fair: Propagate an effective runnable_load_avg"

    This reverts commit 017d4c0519.

commit 8a0c9809f3fbabc583a485af142b40f0b9945a6d
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:26 2024 +0300

    Revert "sched/fair: Implement synchonous PELT detach on load-balance migrate"

    This reverts commit c240cbcefb.

commit 8d002da6ccedb08096a371826525e0e21587f469
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:25 2024 +0300

    Revert "sched/fair: Align PELT windows between cfs_rq and its se"

    This reverts commit 2b669872a6.

commit ceb6db696fe8210e0dc9ce1643fcea314ae84d68
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:25 2024 +0300

    Revert "sched/fair: Implement more accurate async detach"

    This reverts commit a5dce93362.

commit d6afb5da3f6ccc3c426ec6cef2c4b955b0127f79
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:25 2024 +0300

    Revert "sched/fair: Calculate runnable_weight slightly differently"

    This reverts commit 49af7a9c70.

commit 302db79364ce3cdd41e9b33f48a0d5a8c9818ded
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:24 2024 +0300

    Revert "sched/fair: Update calc_group_*() comments"

    This reverts commit 245ba7cfcb.

commit 61ec8d04a966c839f3f204e99e5006a1e42c816a
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:23 2024 +0300

    Revert "sched/pelt: Move PELT related code in a dedicated file"

    This reverts commit 05c4270dbb.

commit 494151c770c9afcf2f4231029bb35d337385c5f8
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:23 2024 +0300

    Revert "sched/rt: Add rt_rq utilization tracking"

    This reverts commit 1efbd97f33.

commit ee00f39c0c36fc5c7d17b9fc8f7a273acd537efe
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:22 2024 +0300

    Revert "sched: fix build"

    This reverts commit bead5679af.

commit c85ab7e00b755f7b0b008d0657ecc805755248e1
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:22 2024 +0300

    Revert "sched/dl: Add dl_rq utilization tracking"

    This reverts commit 1f48d4df0e.

commit d99ffef9a8069efde7bb85869280025b60519db6
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:21 2024 +0300

    Revert "sched/irq: Add IRQ utilization tracking"

    This reverts commit d1c9480337.

commit 2be20e582bf3129e9b572f6b39cef311d0fe3d15
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:20 2024 +0300

    Revert "sched/schedutil: add `schedutil_freq_util()`"

    This reverts commit f3e8cb0f44.

commit b495ca0e67d581fc7cc7d250d26d36fdeb8181c4
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:20 2024 +0300

    Revert "FROMLIST: sched: Relocate arch_scale_cpu_capacity"

    This reverts commit 844d670403.

commit 33c4a0dc0cfb20c02cdb705770f888343d11a1a0
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:19 2024 +0300

    Revert "sched: cpufreq: add map_util_freq()"

    This reverts commit 1f9df50075.

commit d2fedfac7610b0121e6338a8c64ac7414b5b5965
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:19 2024 +0300

    Revert "sched/pelt: Add support to track thermal pressure"

    This reverts commit 17e0388043.

commit afe91b6a918674b387deeb5ec201a3b7fd720c48
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:18 2024 +0300

    Revert "sched/topology: Add callback to read per CPU thermal pressure"

    This reverts commit 69693e21cc.

commit 760f1b0c528c1d2d8fe62be335d98caa74671f51
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:17 2024 +0300

    Revert "drivers/base/arch_topology: Add infrastructure to store and update instantaneous thermal pressure"

    This reverts commit 25a4c1f1a7.

commit 38dc404537a51df485fc1001a440d66f9e5bdfe3
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:17 2024 +0300

    Revert "arm64/topology: Populate arch_scale_thermal_pressure() for arm64 platforms"

    This reverts commit 27a4d5d4ed.

commit 9c1479b093688b09dcf386f019b34203c6dc1367
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:16 2024 +0300

    Revert "arm/topology: Populate arch_scale_thermal_pressure() for ARM platforms"

    This reverts commit 97e866391b.

commit 67039c6e35777ef753590f15f07190b9296a309c
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:16 2024 +0300

    Revert "sched/fair: Enable periodic update of average thermal pressure"

    This reverts commit 7ab5e58f6e.

commit 2a53cfee36bed22116d35da58ea944d5448910d8
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:15 2024 +0300

    Revert "BACKPORT: thermal: cpu_cooling: Update thermal pressure in case of a maximum frequency capping"

    This reverts commit 7d4b5af102.

commit 618bb2f007c30c047bd72ec4764f4da321041382
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:14 2024 +0300

    Revert "UPSTREAM: thermal: cpu_cooling: Update also offline CPUs per-cpu thermal_pressure"

    This reverts commit ee683c1595.

commit 57938e1686bc19a3455355bde6df5b339db46b1b
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:14 2024 +0300

    Revert "sched/fair: Enable tuning of decay period"

    This reverts commit a00583df62.

commit a06182dc6401475c7eb31aed0965d13ebc6881d1
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:13 2024 +0300

    Revert "init: Enable SCHED_THERMAL_PRESSURE by default"

    This reverts commit dcf24600af.

commit 9e5796f48feeb87d2dd5966d4bdffa2c78fd559d
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:12 2024 +0300

    Revert "sched:schedutil: Reland at @65b6f06b6a"

    This reverts commit f1a3787c1f.

commit 5a99475c6807fbd3f6a4b88fe18ef9853de5dbb8
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:12 2024 +0300

    Revert "thermal: cpu_cooling: fix backport"

    This reverts commit 84c18edb67.

commit 78ecfc16e24ee2da5b6f39a30c67d1134c60ee55
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:11 2024 +0300

    Revert "thermal/cpu-cooling, sched/core: Move the arch_set_thermal_pressure() API to generic scheduler code"

    This reverts commit b5c0f9489d.

commit 7df5eb08365aad663f6aff85e47dd87e986b4bd6
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:10 2024 +0300

    Revert "sched/cass: Fix CPU selection when no candidate CPUs are idle"

    This reverts commit 545493ec2d.

commit ee830b6597ce40cc38dee0486ba5b6628eee86a5
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:10 2024 +0300

    Revert "sched/cass: Clean up local variable scope in cass_best_cpu()"

    This reverts commit 03bd4ff101.

commit 298863d627468afad930e3483f8f594af047f149
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:09 2024 +0300

    Revert "sched/cass:checkout to kerneltoast/android_kernel_google_zuma@63f0b82d3"

    This reverts commit 5d04a85007.

commit 5caf8ee3242cc9e531093620e9d0edf34d2d9bd5
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:09 2024 +0300

    Revert "sched/cass: Perform runqueue selection for RT tasks too"

    This reverts commit 9752c4351d.

commit 99b278beedd77cf50fd59af702712b84dfdb6358
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:08 2024 +0300

    Revert "sched/cass: Fix suboptimal task placement when uclamp is used"

    This reverts commit e6b365c5b5.

commit 3b36702eb33d772c3730bdba422c84b2750d0d8e
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:07 2024 +0300

    Revert "sched/cass: Only treat sync waker CPU as idle if there's one task running"

    This reverts commit d12d7eee39.

commit 3037abe8c3c25cfc906ad34ea4155874167835cc
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:07 2024 +0300

    Revert "sched/cass: Eliminate redundant calls to smp_processor_id()"

    This reverts commit f686c22d10.

commit 8bef770149584f5336390367ce20ac8e0991e523
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:06 2024 +0300

    Revert "sched/cass:Adapt for our sources"

    This reverts commit 15e0e548dd.

commit cf47e4130be6df5b4581451dcf712c14ef545044
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:05 2024 +0300

    Revert "sched/fair: Set migration cost to zero to leverage DynamIQ Shared Unit"

    This reverts commit 553243db2f.

commit 9a02852fc864ea71b5d426b5b7aceeba8c72281c
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:05 2024 +0300

    Revert "sbalance: Allow IRQs to be moved off of excluded CPUs"

    This reverts commit cba4cee059.

commit 072fd97abf76bca1587d5a6a183125d8c2667c49
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:04 2024 +0300

    Revert "sbalance: Use a deferrable timer to avoid waking idle CPUs"

    This reverts commit 9efd9574cd.

commit d788e1caba4bf5bc5d3a81d2a96cfbd4d3b51381
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:03 2024 +0300

    Revert "sbalance: Use non-atomic cpumask_clear_cpu() variant"

    This reverts commit 11502f0b2b.

commit be608ef254b24bc0f5a807676bc2aa37a75c0c41
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:03 2024 +0300

    Revert "sbalance: Fix systemic issues caused by flawed IRQ statistics"

    This reverts commit dca8268454.

commit d99703fa29615b4a9f5b051742cd2f64ad65ef53
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:02 2024 +0300

    Revert "cpufreq: schedutil: Use the frequency below the target if they're close"

    This reverts commit 9a7a26c904.

commit 1cf28ec326ca586bd9b5a555e777c0ce49a9371c
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:02 2024 +0300

    Revert "raphael_defconfig: Reduce IRQ_SBALANCE_POLL_MSEC to default 3000"

    This reverts commit ab033d4dca.

commit 53ce1242f7c643c525b239c464ba61cfcef3f5ea
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:01 2024 +0300

    Revert "Revert "cpufreq: schedutil: Use the frequency below the target if they're close""

    This reverts commit 217db388c4.

commit 63f8122b844e194bc197c2a17a11871651017af6
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:00 2024 +0300

    Revert "Revert "cpufreq: schedutil: Extend limit tunables to prime""

    This reverts commit 056441ef45.

commit 09c2c5be085b6c10a6ab974b84404badc85ff393
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:28:00 2024 +0300

    Revert "Revert "cpufreq: schedutil: Expose limit tunables cluster-wise""

    This reverts commit d4c03145b6.

commit f3473acea98f745b142608e182b163c08110affe
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:59 2024 +0300

    Revert "raphael_defconfig:Exclude core 3 from IRQ balancing"

    This reverts commit 1b7085aae6.

commit b7ce57d5f7480430cb9e530a3a7206f2dd6456e5
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:58 2024 +0300

    Revert "cpufreq: schedutil: Use the frequency below the target if they're close"

    This reverts commit b64b535bea.

commit c99a62857fd75daafe6a6a76fdfe20242fb59d9d
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:57 2024 +0300

    Revert "cpufreq: schedutil: Set default up/down rate limits to 500/1000 us"

    This reverts commit 2dff5ad0ff.

commit f505c93a25bc5d741b52e2e219ddd86fd9022976
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:57 2024 +0300

    Revert "cpufreq: schedutil: Protect default up/down rate limits"

    This reverts commit d0164cdf9a.

commit b5b279845790b0b072e3abf3886d5430434d30f4
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:56 2024 +0300

    Revert "drm: Convert CPU-invariant uninterruptible waits to TASK_IDLE"

    This reverts commit a252ff2d01.

commit 9575adb91c2880a5dbeff1fec93aac44295f939a
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:55 2024 +0300

    Revert "pm_qos: Use SRCU notifier chains"

    This reverts commit 29d5065905.

commit 644abfbbd422a8f6aa9902839cfee155b3496c24
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:55 2024 +0300

    Revert "sched/completion: Expose wait_for_common*() to drivers"

    This reverts commit 483d30a710.

commit 1fc08c6cd803c0c24421d5d992ea4b16357aeb92
Author: kondors1995 <normandija1945@gmail.com>
Date:   Fri Apr 19 10:27:51 2024 +0300

    Revert "treewide: Fix warnings"

    This reverts commit abfb9659df.
2024-04-21 22:19:43 +03:00
Samuel Pascua
84c18edb67 thermal: cpu_cooling: fix backport
Signed-off-by: Samuel Pascua <pascua.samuel.14@gmail.com>
2024-04-06 13:00:43 +03:00
Thara Gopinath
17e0388043 sched/pelt: Add support to track thermal pressure
Extrapolating on the existing framework to track rt/dl utilization using
pelt signals, add a similar mechanism to track thermal pressure. The
difference here from rt/dl utilization tracking is that, instead of
tracking time spent by a CPU running a RT/DL task through util_avg, the
average thermal pressure is tracked through load_avg. This is because
thermal pressure signal is weighted time "delta" capacity unlike util_avg
which is binary. "delta capacity" here means delta between the actual
capacity of a CPU and the decreased capacity of a CPU due to a thermal event.

In order to track average thermal pressure, a new sched_avg variable
avg_thermal is introduced. Function update_thermal_load_avg can be called
to do the periodic bookkeeping (accumulate, decay and average) of the
thermal pressure.

Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Thara Gopinath <thara.gopinath@linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lkml.kernel.org/r/20200222005213.3873-2-thara.gopinath@linaro.org
2024-04-06 13:00:42 +03:00
Peter Zijlstra
017d4c0519 sched/fair: Propagate an effective runnable_load_avg
The load balancer uses runnable_load_avg as load indicator. For
!cgroup this is:

  runnable_load_avg = \Sum se->avg.load_avg ; where se->on_rq

That is, a direct sum of all runnable tasks on that runqueue. As
opposed to load_avg, which is a sum of all tasks on the runqueue,
which includes a blocked component.

However, in the cgroup case, this comes apart since the group entities
are always runnable, even if most of their constituent entities are
blocked.

Therefore introduce a runnable_weight which for task entities is the
same as the regular weight, but for group entities is a fraction of
the entity weight and represents the runnable part of the group
runqueue.

Then propagate this load through the PELT hierarchy to arrive at an
effective runnable load average -- which we should not confuse with
the canonical runnable load average.

Suggested-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-04-06 13:00:41 +03:00
kondors1995
cd33b35b43 Merge branch '13.0' into dev/DSP 2023-05-01 21:47:37 +03:00
pwnrazr
617fd5077f Merge remote-tracking branch 'android-stable/android-4.14-stable' into dev-base 2023-05-01 13:06:48 +03:00
Douglas Raillard
fc4c7404ae f2fs: Fix f2fs_truncate_partial_nodes ftrace event
[ Upstream commit 0b04d4c0542e8573a837b1d81b94209e48723b25 ]

Fix the nid_t field so that its size is correctly reported in the text
format embedded in trace.dat files. As it stands, it is reported as
being of size 4:

        field:nid_t nid[3];     offset:24;      size:4; signed:0;

Instead of 12:

        field:nid_t nid[3];     offset:24;      size:12;        signed:0;

This also fixes the reported offset of subsequent fields so that they
match with the actual struct layout.
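
The size and offset arithmetic can be checked with a short Python sketch using ctypes. It assumes nid_t is a 32-bit unsigned integer (consistent with the reported element size of 4) and that the next field follows without padding; the variable names are illustrative only.

```python
import ctypes

# Assumption: nid_t is 32 bits wide, matching the "size:4" in the bad report.
nid_t = ctypes.c_uint32

element_size = ctypes.sizeof(nid_t)      # size of a single nid_t
array_size = ctypes.sizeof(nid_t * 3)    # size of nid_t nid[3]

offset = 24                              # offset of nid[3] in the record
next_offset_wrong = offset + element_size   # what the old format implied
next_offset_right = offset + array_size     # what the struct layout needs

print(element_size, array_size)
print(next_offset_wrong, next_offset_right)
```

The 8-byte gap between the two computed offsets is why every field after nid[3] was misreported before the fix.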

Signed-off-by: Douglas Raillard <douglas.raillard@arm.com>
Reviewed-by: Mukesh Ojha <quic_mojha@quicinc.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-04-26 11:18:56 +02:00
kondors1995
6abb77db34 Import EROFS from 80301c31dc
Squashed commit of the following:

commit 37695a77521cfccbf92840cc13dcc4d8cb7dda96
Author: pwnrazr <1644943+pwnrazr@users.noreply.github.com>
Date:   Thu Feb 16 00:00:20 2023 +0800

    raphael_defconfig: enable erofs highpri percpu kthread

commit 816e4801de2002f5f53e7cd2f7aea282755d5391
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Mon Mar 6 15:48:21 2023 -0500

    fs/(erofs || f2fs): drop WQ_UNBOUND

    Due to asym arm64 latency regression on WQ_UNBOUND

commit d0e5cb53f102962d0d40ff12f548542d71f6340e
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Wed Feb 15 10:44:37 2023 -0500

    erofs/zdata: modify set sched to use RR at high prio for lower latency

    Fixes: bdd668d3b54202

commit afc1c08015966909a27c9d3d53d8796e80c3e4ef
Author: Sandeep Dhavale <dhavale@google.com>
Date:   Wed Feb 8 06:53:49 2023 +0000

    [WIP] BACKPORT: FROMLIST: erofs: add per-cpu threads for decompression

    Using per-cpu thread pool we can reduce the scheduling latency compared
    to workqueue implementation. With this patch scheduling latency and
    variation is reduced as per-cpu threads are high priority kthread_workers.

    The results were evaluated on arm64 Android devices running 5.10 kernel.

    The table below shows resulting improvements of total scheduling latency
    for the same app launch benchmark runs with 50 iterations. Scheduling
    latency is the latency between when the task (workqueue kworker vs
    kthread_worker) became eligible to run to when it actually started
    running.
    +-------------------------+-----------+----------------+---------+
    |                         | workqueue | kthread_worker |  diff   |
    +-------------------------+-----------+----------------+---------+
    | Average (us)            |     15253 |           2914 | -80.89% |
    | Median (us)             |     14001 |           2912 | -79.20% |
    | Minimum (us)            |      3117 |           1027 | -67.05% |
    | Maximum (us)            |     30170 |           3805 | -87.39% |
    | Standard deviation (us) |      7166 |            359 |         |
    +-------------------------+-----------+----------------+---------+
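
    For reference, the diff column is the relative change from the workqueue
    figure to the kthread_worker figure. The Python sketch below recomputes
    it from the rounded values in the table, so results may differ from the
    printed column in the last digit; the helper name is hypothetical.

```python
# (workqueue, kthread_worker) latencies in microseconds, from the table.
rows = {
    "Average": (15253, 2914),
    "Median": (14001, 2912),
    "Minimum": (3117, 1027),
    "Maximum": (30170, 3805),
    "Standard deviation": (7166, 359),
}


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


changes = {name: percent_change(wq, kt) for name, (wq, kt) in rows.items()}
for name, change in changes.items():
    print(f"{name}: {change:.2f}%")
```

    The same calculation applied to the standard deviation row, which has
    no diff entry in the table, gives a reduction of roughly 95%.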

    Background: Boot times and cold app launch benchmarks are very
    important to the android ecosystem as they directly translate to
    responsiveness from user point of view. While erofs provides
    a lot of important features like space savings, we saw some
    performance penalty in cold app launch benchmarks in a few scenarios.
    Analysis showed that the significant variance was coming from the
    scheduling cost while decompression cost was more or less the same.

    Having per-cpu thread pool we can see from the above table that this
    variation is reduced by ~80% on average. This problem was discussed
    at LPC 2022. Link to LPC 2022 slides and
    talk at [1]

    [1] https://lpc.events/event/16/contributions/1338/

    Link: https://lore.kernel.org/lkml/Y+DP6V9fZG7XPPGy@debian/

    Change-Id: I454da5bc17f285d99047b93dc1fc70444f287156
    Signed-off-by: Sandeep Dhavale <dhavale@google.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 354d97368e8ffd832a43f6aa0d7c43f52268ca80
Author: pwnrazr <1644943+pwnrazr@users.noreply.github.com>
Date:   Sat May 7 13:21:24 2022 +0800

    sm8150: dtsi: remove barrier and discard mount options

commit 6c0b4a711ecb5b0e30c6115959b48af641e9b5bf
Author: pwnrazr <1644943+pwnrazr@users.noreply.github.com>
Date:   Sat May 7 13:20:47 2022 +0800

    Revert "arch: arm64: disable erofs"

    This reverts commit fe6fe5ef6107fc245ca50cd38f585e580fe2fc59.

commit 515b1441ad6ac0f9e1c74013cd80e9b30065edc0
Author: kondors1995 <normandija1945@gmail.com>
Date:   Wed Feb 8 16:43:29 2023 +0200

    Revert "raphael_defconfig: Revert FBEv2 defconfig changes"

    This reverts commit 97bb4a1d5d103804c72617481fca9b6cf93660a2.

commit c010e1a5176d73f3829ce49cfdb0fcc0ee5c777c
Author: Yue Hu <huyue2@coolpad.com>
Date:   Thu Apr 7 13:05:43 2022 +0800

    erofs: do not prompt for risk any more when using big pcluster

    The big pcluster feature was merged a year ago and has been mostly
    stable since.

    Signed-off-by: Yue Hu <huyue2@coolpad.com>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20220407050505.12683-1-huyue2@coolpad.com
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Signed-off-by: Cyber Knight <cyberknight755@gmail.com>

commit b135290ae7af3f5f7b69e24c6ca678c4f6572cf2
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Mon Jun 6 13:23:06 2022 -0400

    erofs: Squashed revert of some recent backports:

    Keep out of the release branch until
    d71eb1da8e8b59a7072c51ce48175e159ecfd79a is fixed and the readmore
    decompression strategy is introduced.

    commit b9494371e2493f1a8ccc18b1c80f67867f6f623a
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:49 2022 -0400

        Revert "erofs: iomap support for non-tailpacking DIO"

        This reverts commit 804ddc92b769a9cc9926d0262725e6330d0f0a76.

    commit 0649a6ed5e759857aabc334abeddacbe4eac7859
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:41 2022 -0400

        Revert "erofs: adapt 3f4e33b91a28 to our tree"

        This reverts commit 016f1ffa36da74ab67ed99abd474a0b2da5133eb.

    commit a3704a5a79990f75c8336c9001939db6e6d21181
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:33 2022 -0400

        Revert "erofs: add support for the full decompressed length"

        This reverts commit a4a195b954114aeb741cf4f8b14256ed92e7c545.

    commit 5a506fe78d7624f1a94e60d0e3d7113ae6934ea7
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:27 2022 -0400

        Revert "erofs: add fiemap support with iomap"

        This reverts commit 07577933c3fb397791f113ad36fac7a061385826.

    commit dd93cf9efb3d1f9608780c44a50a860eb9921cf4
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:16 2022 -0400

        Revert "erofs: introduce chunk-based file on-disk format"

        This reverts commit 690f4dc6d3b27ed6278b8fbae20273883f616e56.

    commit a1846fe6257df43564f42eb153131796f3fd84ed
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:08 2022 -0400

        Revert "erofs: support reading chunk-based uncompressed files"

        This reverts commit 5bd83bfc55b6169af5bbf3c0ba4528577c2fa1ff.

    commit 3e1c2530db00b6605d8db09e207cb3633e61cdba
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:22:03 2022 -0400

        Revert "erofs: fix double free of 'copied'"

        This reverts commit c608a6f861e0d457d6c9a5905e8b3d928e672075.

    commit 7a9e0f351f8d41a01a0763316bbd4b6ace94bea0
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:21:52 2022 -0400

        Revert "erofs: fix misbehavior of unsupported chunk format check"

        This reverts commit 751e7c533e451b3c6a51f7d2a69224cca39e8c20.

    commit 37b05816e45d519643dd9d162b827311abf3b034
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:21:44 2022 -0400

        Revert "erofs: get compression algorithms directly on mapping"

        This reverts commit 98b09cde747826f6fe3aae50eb05659f7f2803f7.

    commit de74ca4af181a35ac037a44f07cf6a7e55e0f127
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:21:35 2022 -0400

        Revert "erofs: introduce the secondary compression head"

        This reverts commit feea4ee667bf5d5fa2c6d0c5f57697476dce7ca7.

    commit dda6e8eaddd3203cfafd6c82d2e751f2e6d16766
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:21:29 2022 -0400

        Revert "erofs: clean up z_erofs_extent_lookback"

        This reverts commit c08dbda40a4f3016ee6c60ae2a19e3ecc518361c.

    commit 2e5fd527a76eba733464b0ba71fe92abc839b62b
    Author: John Galt <johngaltfirstrun@gmail.com>
    Date:   Mon Jun 6 13:21:23 2022 -0400

        Revert "erofs: clean up erofs_map_blocks tracepoints"

        This reverts commit d71eb1da8e8b59a7072c51ce48175e159ecfd79a.

commit ed6e7f36515d6d80c75c4d0803b636e17f328a6c
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Thu Dec 9 09:29:18 2021 +0800

    erofs: clean up erofs_map_blocks tracepoints

    Since the new type of chunk-based files is introduced, there is no
    need to leave flatmode tracepoints.

    Rename to erofs_map_blocks instead.

    Link: https://lore.kernel.org/r/20211209012918.30337-1-hsiangkao@linux.alibaba.com
    Reviewed-by: Yue Hu <huyue2@yulong.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 525147ad9beef7e521c1667509db763e970c06d3
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Fri Mar 11 02:27:42 2022 +0800

    erofs: clean up z_erofs_extent_lookback

    Avoid the unnecessary tail recursion since it can be converted into
    a loop directly in order to prevent potential stack overflow.

    It's a pretty straightforward conversion.
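
    As a minimal sketch of this kind of conversion (not the actual
    z_erofs_extent_lookback() code; the struct, the field names and both
    helpers below are hypothetical), a lookback that follows per-lcluster
    distances until it reaches a HEAD can be written recursively or as a
    loop that does the same walk:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified lcluster index entry (not the on-disk format). */
enum lc_type { LC_HEAD, LC_NONHEAD };

struct lc_entry {
	enum lc_type type;
	unsigned int delta0;	/* distance back towards the HEAD; may be partial */
};

/*
 * Tail-recursive form: each hop may add a stack frame unless the compiler
 * optimizes the tail call away. Returns -1 on a corrupted index.
 */
static long lookback_recursive(const struct lc_entry *tbl, size_t lcn)
{
	if (tbl[lcn].type == LC_HEAD)
		return (long)lcn;
	if (tbl[lcn].delta0 == 0 || tbl[lcn].delta0 > lcn)
		return -1;
	return lookback_recursive(tbl, lcn - tbl[lcn].delta0);
}

/* Loop form: the same walk with constant stack usage. */
static long lookback_loop(const struct lc_entry *tbl, size_t lcn)
{
	while (tbl[lcn].type != LC_HEAD) {
		if (tbl[lcn].delta0 == 0 || tbl[lcn].delta0 > lcn)
			return -1;
		lcn -= tbl[lcn].delta0;
	}
	return (long)lcn;
}
```

    Because the recursive call is the last action of the function, the
    loop only has to update lcn and test the same condition again.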

    Link: https://lore.kernel.org/r/20220310182743.102365-1-hsiangkao@linux.alibaba.com
    Reviewed-by: Yue Hu <huyue2@coolpad.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit db45bcfb35a2cd8d49e159c0cc70635b713183a4
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Mon Oct 18 00:57:21 2021 +0800

    erofs: introduce the secondary compression head

    Previously, each HEAD lcluster could be either a HEAD or a PLAIN
    lcluster, indicating whether the whole pcluster is compressed or not.

    In this patch, a new HEAD2 head type is introduced to specify another
    compression algorithm other than the primary algorithm for each
    compressed file, which can be used for upcoming LZMA compression and
    LZ4 range dictionary compression for various data patterns.
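
    The idea can be sketched as follows. This is an illustration only:
    the enum names, their values and the choice of LZ4/LZMA as the two
    algorithms are assumptions, not the on-disk definitions.

```c
#include <assert.h>

/* Hypothetical names and values, for illustration only. */
enum lc_head_type { LC_PLAIN, LC_HEAD1, LC_HEAD2 };
enum algo { ALGO_NONE, ALGO_LZ4, ALGO_LZMA };

/* Per-file configuration: at most two compression algorithms. */
struct file_cfg {
	enum algo primary;	/* used by HEAD1 pclusters */
	enum algo secondary;	/* used by HEAD2 pclusters */
};

/* The head type of a pcluster selects which algorithm decodes it. */
static enum algo pcluster_algo(const struct file_cfg *cfg,
			       enum lc_head_type head)
{
	switch (head) {
	case LC_HEAD1:
		return cfg->primary;
	case LC_HEAD2:
		return cfg->secondary;
	case LC_PLAIN:
	default:
		return ALGO_NONE;	/* stored uncompressed */
	}
}
```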

    It has been on the EROFS roadmap for years. Complete it now!

    Link: https://lore.kernel.org/r/20211017165721.2442-1-xiang@kernel.org
    Reviewed-by: Yue Hu <huyue2@yulong.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit f0fe9e97d03ed484a51f764373ad0c5941949869
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Sat Oct 9 04:08:37 2021 +0800

    erofs: get compression algorithms directly on mapping

    Currently, z_erofs_map_blocks_iter() returns whether extents are
    compressed or not, and the decompression frontend gets the specific
    algorithms then.

    It works, but not quite well in many aspects, for example:
     - The decompression frontend has to deal with whether extents are
       compressed or not again and look up the algorithms if compressed.
       It's duplicated and too detailed about the on-disk mapping.

     - A new secondary compression head will be introduced later so that
       each file can have 2 compression algorithms at most for different
       types of data. It could increase the complexity of the decompression
       frontend if still handled in this way.

     - A new readmore decompression strategy will be introduced to get
       better performance for much bigger pcluster and lzma, which needs
       the specific algorithm in advance as well.

    Let's look up compression algorithms in z_erofs_map_blocks_iter()
    directly instead.

    Link: https://lore.kernel.org/r/20211008200839.24541-2-xiang@kernel.org
    Reviewed-by: Chao Yu <chao@kernel.org>
    Reviewed-by: Yue Hu <huyue2@yulong.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 588fc2156404c552d4c2c7bcc5def820966a1ba1
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Wed Sep 22 17:51:41 2021 +0800

    erofs: fix misbehavior of unsupported chunk format check

    Unsupported chunk format should be checked with
    "if (vi->chunkformat & ~EROFS_CHUNK_FORMAT_ALL)"

    Found when checking with 4k-byte blockmap (although currently mkfs
    uses inode chunk indexes format by default.)
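
    A small sketch of the check quoted above. The macro names and bit
    values here are assumptions chosen for illustration; only the shape
    of the test, "any bit outside the known set", comes from this message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit layout; treat these values as assumptions. */
#define CHUNK_FORMAT_BLKBITS_MASK	0x001Fu
#define CHUNK_FORMAT_INDEXES		0x0020u
#define CHUNK_FORMAT_ALL \
	(CHUNK_FORMAT_BLKBITS_MASK | CHUNK_FORMAT_INDEXES)

/* True if any bit outside the known set is present. */
static bool chunkformat_unsupported(uint16_t chunkformat)
{
	return (chunkformat & ~CHUNK_FORMAT_ALL) != 0;
}
```

    With this form, a format value with no bits set at all still passes,
    and only unknown bits cause a rejection.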

    Link: https://lore.kernel.org/r/20210922095141.233938-1-hsiangkao@linux.alibaba.com
    Fixes: c5aa903a59db ("erofs: support reading chunk-based uncompressed files")
    Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 613122535bafaabb0e58a9c347c5b6f1b8e6fa91
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Wed Aug 25 20:07:57 2021 +0800

    erofs: fix double free of 'copied'

    Dan reported a new smatch warning [1]
    "fs/erofs/inode.c:210 erofs_read_inode() error: double free of 'copied'"

    Due to new chunk-based format handling logic, the error path can be
    called after kfree(copied).

    Set "copied = NULL" after kfree(copied) to fix this.
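
    The pattern can be shown in userspace with malloc()/free() standing
    in for the kernel allocator. read_inode_sketch() and its fail_late
    parameter are hypothetical and only model the control flow, not
    erofs_read_inode() itself:

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success, -1 on failure. */
static int read_inode_sketch(int fail_late)
{
	char *copied = malloc(64);
	int err = 0;

	if (!copied)
		return -1;

	/* ... use the temporary copy ... */
	free(copied);
	copied = NULL;	/* without this, the error path would free twice */

	if (fail_late) {
		err = -1;
		goto err_out;
	}
	return 0;

err_out:
	free(copied);	/* free(NULL) is a no-op, as is kfree(NULL) */
	return err;
}
```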

    [1] https://lore.kernel.org/r/202108251030.bELQozR7-lkp@intel.com

    Link: https://lore.kernel.org/r/20210825120757.11034-1-hsiangkao@linux.alibaba.com
    Fixes: c5aa903a59db ("erofs: support reading chunk-based uncompressed files")
    Reported-by: kernel test robot <lkp@intel.com>
    Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 7b648f684ea7c99deab7278f0c2cbbf74797a56d
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Fri Aug 20 18:00:19 2021 +0800

    erofs: support reading chunk-based uncompressed files

    Add runtime support for chunk-based uncompressed files
    described in the previous patch.

    Link: https://lore.kernel.org/r/20210820100019.208490-2-hsiangkao@linux.alibaba.com
    Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit d9737546275a3c460177a3ce9e01096bc3cfc3ad
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Fri Aug 20 18:00:18 2021 +0800

    erofs: introduce chunk-based file on-disk format

    Currently, uncompressed data except for tail-packing inline is
    consecutive on disk.

    In order to support chunk-based data deduplication, add a new
    corresponding inode data layout.

    In the future, the data source of chunks can be either compressed or
    uncompressed.

    Link: https://lore.kernel.org/r/20210820100019.208490-1-hsiangkao@linux.alibaba.com
    Reviewed-by: Liu Bo <bo.liu@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 47f6bed39a7a83aa59be667657cba886dbd4b79b
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Fri Aug 13 13:29:31 2021 +0800

    erofs: add fiemap support with iomap

    This adds fiemap support for both uncompressed files and compressed
    files by using iomap infrastructure.

    Link: https://lore.kernel.org/r/20210813052931.203280-3-hsiangkao@linux.alibaba.com
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 82cc95ee585c9b033a43b0564173d4c444e3a4ac
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Wed Aug 18 23:22:31 2021 +0800

    erofs: add support for the full decompressed length

    Previously, there was no need to get the full decompressed length since
    EROFS supports partial decompression. However, for some other cases
    such as fiemap, the full decompressed length is necessary for iomap to
    work properly.

    This patch adds a way to get the full decompressed length. Note that
    it incurs more metadata overhead and should be avoided where possible
    in performance-sensitive scenarios.

    Link: https://lore.kernel.org/r/20210818152231.243691-1-hsiangkao@linux.alibaba.com
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 8ff30ee6aaa1130bc26af4a98a818d91820c0bdb
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Thu May 12 12:08:04 2022 -0400

    erofs: adapt 3f4e33b91a28 to our tree

commit 71e2f8865698e382349a16d8f90e5d74f935ff2a
Author: Huang Jianan <huangjianan@oppo.com>
Date:   Thu Aug 5 08:35:59 2021 +0800

    erofs: iomap support for non-tailpacking DIO

    Add iomap support for non-tailpacking uncompressed data in order to
    support DIO and DAX.

    Direct I/O is useful in certain scenarios for uncompressed files.
    For example, double pagecache can be avoided by direct I/O when
    a loop device is used for uncompressed files containing an upper-layer
    compressed filesystem.

    This adds iomap DIO support for non-tailpacking cases first and
    tail-packing inline files are handled in the follow-up patch.

    Link: https://lore.kernel.org/r/20210805003601.183063-2-hsiangkao@linux.alibaba.com
    Cc: linux-fsdevel@vger.kernel.org
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Huang Jianan <huangjianan@oppo.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 8bc571a229c3701405ac47f689db283ac99f2b2d
Author: Goldwyn Rodrigues <rgoldwyn@suse.com>
Date:   Fri Aug 30 12:09:24 2019 -0500

    fs: export generic_file_buffered_read()

    Export generic_file_buffered_read() to be used to supplement incomplete
    direct reads.

    Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
    Reviewed-by: David Sterba <dsterba@suse.com>
    Signed-off-by: David Sterba <dsterba@suse.com>

commit 34c8cbbc7b932ac50e90da6e838524fd1f162aca
Author: Dan Williams <dan.j.williams@intel.com>
Date:   Wed Mar 7 15:26:44 2018 -0800

    fs, dax: prepare for dax-specific address_space_operations

    In preparation for the dax implementation to start associating dax pages
    to inodes via page->mapping, we need to provide a 'struct
    address_space_operations' instance for dax. Define some generic VFS aops
    helpers for dax. These noop implementations are there in the dax case to
    prevent the VFS from falling back to operations with page-cache
    assumptions; dax_writeback_mapping_range() may not be referenced in the
    FS_DAX=n case.

    Cc: Jeff Moyer <jmoyer@redhat.com>
    Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
    Suggested-by: Matthew Wilcox <mawilcox@microsoft.com>
    Suggested-by: Jan Kara <jack@suse.cz>
    Suggested-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Reviewed-by: Jan Kara <jack@suse.cz>
    Suggested-by: Dave Chinner <david@fromorbit.com>
    Signed-off-by: Dan Williams <dan.j.williams@intel.com>

commit b0da008763834f165e8a055e011f223b3981316d
Author: Andreas Gruenbacher <agruenba@redhat.com>
Date:   Sun Oct 1 17:55:54 2017 -0400

    iomap: Switch from blkno to disk offset

    Replace iomap->blkno, the sector number, with iomap->addr, the disk
    offset in bytes.  For invalid disk offsets, use the special value
    IOMAP_NULL_ADDR instead of IOMAP_NULL_BLOCK.

    This allows iomap to be used for mappings which are not block aligned,
    such as inline data on ext4.
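
    A minimal sketch of the unit change, assuming 512-byte sectors. The
    macro and function names are hypothetical, and NULL_ADDR_SKETCH is
    only a local stand-in for the kernel's IOMAP_NULL_ADDR:

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT_SKETCH	9		/* assumes 512-byte sectors */
#define NULL_ADDR_SKETCH	UINT64_MAX	/* stand-in for IOMAP_NULL_ADDR */

/* Convert an old-style sector number into a byte offset on disk. */
static uint64_t sector_to_addr(uint64_t blkno)
{
	return blkno << SECTOR_SHIFT_SKETCH;
}

/* A byte address can also describe data that starts inside a sector. */
static int addr_is_sector_aligned(uint64_t addr)
{
	return (addr & ((UINT64_C(1) << SECTOR_SHIFT_SKETCH) - 1)) == 0;
}
```

    A sector number can only name offsets that are multiples of the
    sector size, while a byte address such as 4096 + 100 can point into
    the middle of a block, which is what inline data needs.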

    Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>  # iomap, xfs
    Reviewed-by: Jan Kara <jack@suse.cz>

commit b74997cce993dd0408a0beeb36bd28652e272108
Author: Matthew Wilcox <mawilcox@microsoft.com>
Date:   Tue Nov 28 15:39:51 2017 -0500

    idr: Rename idr_for_each_entry_ext

    In most places in the kernel where we need to distinguish functions by
    the type of their arguments, we use '_ul' as a suffix for the unsigned
    long variant, not '_ext'.  Also add kernel-doc.

    Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

commit a562faeba73cfb13de1f278c95be606faa3e4f21
Author: Matthew Wilcox <mawilcox@microsoft.com>
Date:   Tue Nov 28 10:14:27 2017 -0500

    idr: Add idr_alloc_u32 helper

    All current users of idr_alloc_ext() actually want to allocate a u32
    and idr_alloc_u32() fits their needs better.

    Like idr_get_next(), it uses a 'nextid' argument which serves as both
    a pointer to the start ID and the assigned ID (instead of a separate
    minimum and pointer-to-assigned-ID argument).  It uses a 'max' argument
    rather than 'end' because the semantics that idr_alloc has for 'end'
    don't work well for unsigned types.

    Since idr_alloc_u32() returns an errno instead of the allocated ID, mark
    it as __must_check to help callers use it correctly.  Include copious
    kernel-doc.  Chris Mi <chrism@mellanox.com> has promised to contribute
    test-cases for idr_alloc_u32.
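
    The calling convention described above can be modelled with a toy
    allocator. This is not the kernel implementation: struct toy_idr,
    TOY_IDS and toy_alloc_u32() are hypothetical and only mimic the
    in/out 'nextid' argument, the inclusive 'max' and the errno-style
    return value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_IDS 8	/* toy capacity; a real IDR is not limited like this */

struct toy_idr {
	bool used[TOY_IDS];
};

/*
 * *nextid is the first ID to try on entry and the allocated ID on
 * success; max is inclusive. Returns 0 or a negative errno, never the
 * ID itself, which is why callers must check the return value.
 */
static int toy_alloc_u32(struct toy_idr *idr, uint32_t *nextid, uint32_t max)
{
	uint32_t id;

	for (id = *nextid; id <= max && id < TOY_IDS; id++) {
		if (!idr->used[id]) {
			idr->used[id] = true;
			*nextid = id;
			return 0;
		}
	}
	return -ENOSPC;	/* *nextid is left untouched on failure */
}
```

    An inclusive 'max' lets a caller pass UINT32_MAX to mean the whole
    u32 range, which an exclusive 'end' of the same type cannot express.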

    Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>

commit 4b24e4564260899c64b9532440a9b5545dbfe7f9
Author: Matthew Wilcox <mawilcox@microsoft.com>
Date:   Tue Apr 10 16:36:48 2018 -0700

    fscache: use appropriate radix tree accessors

    Don't open-code accesses to data structure internals.

    Link: http://lkml.kernel.org/r/20180313132639.17387-7-willy@infradead.org
    Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
    Reviewed-by: Jeff Layton <jlayton@redhat.com>
    Cc: Darrick J. Wong <darrick.wong@oracle.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit 7469480be01c3394807cbd0991f06b8d6f2d4403
Author: Matthew Wilcox <mawilcox@microsoft.com>
Date:   Tue Apr 10 16:36:44 2018 -0700

    export __set_page_dirty

    XFS currently contains a copy-and-paste of __set_page_dirty().  Export
    it from buffer.c instead.

    Link: http://lkml.kernel.org/r/20180313132639.17387-6-willy@infradead.org
    Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
    Acked-by: Jeff Layton <jlayton@kernel.org>
    Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
    Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit c53045287025992bc775081dbab63ac926a597e8
Author: Matthew Wilcox <mawilcox@microsoft.com>
Date:   Tue Apr 10 16:36:28 2018 -0700

    radix tree: use GFP_ZONEMASK bits of gfp_t for flags

    Patch series "XArray", v9.  (First part thereof).

    This patchset is, I believe, appropriate for merging for 4.17.  It
    contains the XArray implementation, to eventually replace the radix
    tree, and converts the page cache to use it.

    This conversion keeps the radix tree and XArray data structures in sync
    at all times.  That allows us to convert the page cache one function at
    a time and should allow for easier bisection.  Other than renaming some
    elements of the structures, the data structures are fundamentally
    unchanged; a radix tree walk and an XArray walk will touch the same
    number of cachelines.  I have changes planned to the XArray data
    structure, but those will happen in future patches.

    Improvements the XArray has over the radix tree:

     - The radix tree provides operations like other trees do; 'insert' and
       'delete'. But what most users really want is an automatically
       resizing array, and so it makes more sense to give users an API that
       is like an array -- 'load' and 'store'. We still have an 'insert'
       operation for users that really want that semantic.

     - The XArray considers locking as part of its API. This simplifies a
       lot of users who formerly had to manage their own locking just for
       the radix tree. It also improves code generation as we can now tell
       RCU that we're holding a lock and it doesn't need to generate as much
       fencing code. The other advantage is that tree nodes can be moved
       (not yet implemented).

     - GFP flags are now parameters to calls which may need to allocate
       memory. The radix tree forced users to decide what the allocation
       flags would be at creation time. It's much clearer to specify them at
       allocation time.

     - Memory is not preloaded; we don't tie up dozens of pages on the off
       chance that the slab allocator fails. Instead, we drop the lock,
       allocate a new node and retry the operation. We have to convert all
       the radix tree, IDA and IDR preload users before we can realise this
       benefit, but I have not yet found a user which cannot be converted.

     - The XArray provides a cmpxchg operation. The radix tree forces users
       to roll their own (and at least four have).

     - Iterators take a 'max' parameter. That simplifies many users and will
       reduce the amount of iteration done.

     - Iteration can proceed backwards. We only have one user for this, but
       since it's called as part of the pagefault readahead algorithm, that
       seemed worth mentioning.

     - RCU-protected pointers are not exposed as part of the API. There are
       some fun bugs where the page cache forgets to use rcu_dereference()
       in the current codebase.

     - Value entries gain an extra bit compared to radix tree exceptional
       entries. That gives us the extra bit we need to put huge page swap
       entries in the page cache.

     - Some iterators now take a 'filter' argument instead of having
       separate iterators for tagged/untagged iterations.

    The page cache is improved by this:

     - Shorter, easier to read code

     - More efficient iterations

     - Reduction in size of struct address_space

     - Fewer walks from the top of the data structure; the XArray API
       encourages staying at the leaf node and conducting operations there.

    This patch (of 8):

    None of these bits may be used for slab allocations, so we can use them
    as radix tree flags as long as we mask them off before passing them to
    the slab allocator. Move the IDR flag from the high bits to the
    GFP_ZONEMASK bits.
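
    A rough sketch of the masking idea, with a made-up bit layout. The
    type, the macro values and the helper names are all assumptions for
    illustration and do not match the real gfp_t definitions:

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t gfp_sketch_t;

/* Hypothetical layout: the low four bits stand in for the zone bits. */
#define SKETCH_ZONEMASK		0x0fu
#define SKETCH_ROOT_IS_IDR	0x08u	/* private tree flag kept in a zone bit */
#define SKETCH_GFP_KERNEL	0x100u	/* stand-in for a real allocation flag */

/* Flags as stored in the tree root: allocation flags plus private bits. */
static gfp_sketch_t root_flags(gfp_sketch_t gfp, int is_idr)
{
	return gfp | (is_idr ? SKETCH_ROOT_IS_IDR : 0);
}

/* What may be handed to the allocator: private bits masked off. */
static gfp_sketch_t root_gfp_mask(gfp_sketch_t stored)
{
	return stored & ~SKETCH_ZONEMASK;
}

static int root_is_idr(gfp_sketch_t stored)
{
	return (stored & SKETCH_ROOT_IS_IDR) != 0;
}
```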

    Link: http://lkml.kernel.org/r/20180313132639.17387-3-willy@infradead.org
    Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com>
    Acked-by: Jeff Layton <jlayton@kernel.org>
    Cc: Darrick J. Wong <darrick.wong@oracle.com>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
    Cc: Will Deacon <will.deacon@arm.com>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

commit c95250f9f545568b87775f1a2a48d412203161f7
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Mon May 16 10:45:14 2022 -0400

    Revert "erofs: compression fixes"

    This reverts commit 208dabff2d5e3e616a86df8bdba814d54b1a8a1f.

    Fixes a deadlock when shrinking the erofs slab.

commit d07627505cd871bb1a539377434dede2f4a18d9c
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Mon May 16 09:41:14 2022 -0400

    Revert "erofs: fixes for compilation"

    This reverts commit c7bf11979051cda0e7b37857289503fa4831c549.

commit 7846d0f267ba3572570917e4880d60c79939bf5c
Author: Hongyu Jin <hongyu.jin@unisoc.com>
Date:   Fri Apr 1 19:55:27 2022 +0800

    erofs: fix use-after-free of on-stack io[]

    The root cause is the race as follows:
    Thread #1                              Thread #2(irq ctx)

    z_erofs_runqueue()
      struct z_erofs_decompressqueue io_A[];
      submit bio A
      z_erofs_decompress_kickoff(,,1)
                                           z_erofs_decompressqueue_endio(bio A)
                                           z_erofs_decompress_kickoff(,,-1)
                                           spin_lock_irqsave()
                                           atomic_add_return()
      io_wait_event()	-> pending_bios is already 0
      [end of function]
                                           wake_up_locked(io_A[]) // crash

    Referenced backtrace in kernel 5.4:

    [   10.129422] Unable to handle kernel paging request at virtual address eb0454a4
    [   10.364157] CPU: 0 PID: 709 Comm: getprop Tainted: G        WC O      5.4.147-ab09225 #1
    [   11.556325] [<c01b33b8>] (__wake_up_common) from [<c01b3300>] (__wake_up_locked+0x40/0x48)
    [   11.565487] [<c01b3300>] (__wake_up_locked) from [<c044c8d0>] (z_erofs_vle_unzip_kickoff+0x6c/0xc0)
    [   11.575438] [<c044c8d0>] (z_erofs_vle_unzip_kickoff) from [<c044c854>] (z_erofs_vle_read_endio+0x16c/0x17c)
    [   11.586082] [<c044c854>] (z_erofs_vle_read_endio) from [<c06a80e8>] (clone_endio+0xb4/0x1d0)
    [   11.595428] [<c06a80e8>] (clone_endio) from [<c04a1280>] (blk_update_request+0x150/0x4dc)
    [   11.604516] [<c04a1280>] (blk_update_request) from [<c06dea28>] (mmc_blk_cqe_complete_rq+0x144/0x15c)
    [   11.614640] [<c06dea28>] (mmc_blk_cqe_complete_rq) from [<c04a5d90>] (blk_done_softirq+0xb0/0xcc)
    [   11.624419] [<c04a5d90>] (blk_done_softirq) from [<c010242c>] (__do_softirq+0x184/0x56c)
    [   11.633419] [<c010242c>] (__do_softirq) from [<c01051e8>] (irq_exit+0xd4/0x138)
    [   11.641640] [<c01051e8>] (irq_exit) from [<c010c314>] (__handle_domain_irq+0x94/0xd0)
    [   11.650381] [<c010c314>] (__handle_domain_irq) from [<c04fde70>] (gic_handle_irq+0x50/0xd4)
    [   11.659641] [<c04fde70>] (gic_handle_irq) from [<c0101b70>] (__irq_svc+0x70/0xb0)

    Signed-off-by: Hongyu Jin <hongyu.jin@unisoc.com>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Link: https://lore.kernel.org/r/20220401115527.4935-1-hongyu.jin.cn@gmail.com
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 9fa705504bf016a360c10edc3c9c5cbf8d870a78
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Thu May 5 22:40:43 2022 -0400

    erofs: extend 3812dc21ec

commit 4cda8c8c3d0ea4b3cb0f660db01697b50f7bfddc
Author: Yue Hu <huyue2@yulong.com>
Date:   Thu Oct 14 14:57:44 2021 +0800

    erofs: remove the fast path of per-CPU buffer decompression

    As Xiang mentioned, this path has no real impact on our current
    decompression strategy, so remove it directly. Also, update the return
    value of z_erofs_lz4_decompress() to 0 on success to stay consistent
    with LZMA, which returns 0 in that case as well.

    Link: https://lore.kernel.org/r/20211014065744.1787-1-zbestahu@gmail.com
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Signed-off-by: Yue Hu <huyue2@yulong.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 20122adf7721eff6c6ff90db545e0597501d942f
Author: Yue Hu <huyue2@yulong.com>
Date:   Tue Sep 14 11:59:15 2021 +0800

    erofs: clear compacted_2b if compacted_4b_initial > totalidx

    Currently, if compacted_4b_initial > totalidx, all indexes are
    compacted 4B only. So, the calculated compacted_2b is worthless in
    that case and may waste CPU resources.

    No need to update compacted_4b_initial as mkfs since it's used to
    fulfill the alignment of the 1st compacted_2b pack and would handle
    the case above.

    We also need to clarify compacted_4b_end here. It's used for the
    last lclusters which don't fit in the previous compacted_2b
    packs.
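
    How the three counters relate can be sketched as below. The
    constants (32-byte alignment, 4 bytes per 4B entry, 16 lclusters per
    2B pack) and all names are assumptions made for this illustration,
    not taken from this message:

```c
#include <assert.h>

struct compact_layout {
	unsigned int c4b_initial;	/* leading 4B entries, for alignment */
	unsigned int c2b;		/* entries stored in 2B packs */
	unsigned int c4b_end;		/* trailing 4B entries */
};

/*
 * ebase: byte offset where the index area starts.
 * Note that c4b_initial may exceed totalidx; in that case every index
 * is a 4B one and both other counters are 0.
 */
static struct compact_layout split_indexes(unsigned int ebase,
					   unsigned int totalidx,
					   int want_2b)
{
	struct compact_layout l;

	/* 4B entries needed to reach the next 32-byte boundary. */
	l.c4b_initial = (32 - ebase % 32) / 4;
	if (l.c4b_initial == 32 / 4)
		l.c4b_initial = 0;	/* already aligned */

	/* Only compute the 2B part when there is room left for it. */
	if (want_2b && l.c4b_initial < totalidx)
		l.c2b = (totalidx - l.c4b_initial) / 16 * 16;	/* round down */
	else
		l.c2b = 0;

	/* Whatever did not fit into the leading 4B run or the 2B packs. */
	l.c4b_end = totalidx > l.c4b_initial + l.c2b ?
		    totalidx - l.c4b_initial - l.c2b : 0;
	return l;
}
```

    In this sketch the "c4b_initial < totalidx" guard also keeps the
    unsigned subtraction from wrapping around when there are fewer
    indexes than alignment entries.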

    Some messages are from Xiang.

    Link: https://lore.kernel.org/r/20210914035915.1190-1-zbestahu@gmail.com
    Signed-off-by: Yue Hu <huyue2@yulong.com>
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    [ Gao Xiang: it's enough to use "compacted_4b_initial < totalidx". ]
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 3243783e85d10ccc00b9e8cb37960ed1fc1e9fef
Author: Yue Hu <huyue2@yulong.com>
Date:   Tue Aug 10 15:24:16 2021 +0800

    erofs: remove the mapping parameter from erofs_try_to_free_cached_page()

    The mapping is not used at all, remove it and update related code.

    Link: https://lore.kernel.org/r/20210810072416.1392-1-zbestahu@gmail.com
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Yue Hu <huyue2@yulong.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 2936d3798b6c340459813a0eeb2409a4cb34e44f
Author: Yue Hu <huyue2@yulong.com>
Date:   Tue Aug 10 14:54:50 2021 +0800

    erofs: directly use wrapper erofs_page_is_managed() when shrinking

    We already have the wrapper function to identify a managed page.

    Link: https://lore.kernel.org/r/20210810065450.1320-1-zbestahu@gmail.com
    Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Reviewed-by: Chao Yu <chao@kernel.org>
    Signed-off-by: Yue Hu <huyue2@yulong.com>
    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>

commit 09b3effb67cdec2ce718d83a363c0a2df5f3d372
Author: Yue Hu <huyue2@yulong.com>
Date:   Mon Apr 19 18:26:23 2021 +0800

    erofs: remove the occupied parameter from z_erofs_pagevec_enqueue()

    The variable occupied has no effect in z_erofs_attach_page(), which
    is the only caller of z_erofs_pagevec_enqueue().

    Link: https://lore.kernel.org/r/20210419102623.2015-1-zbestahu@gmail.com
    Signed-off-by: Yue Hu <huyue2@yulong.com>
    Reviewed-by: Gao Xiang <xiang@kernel.org>
    Signed-off-by: Gao Xiang <xiang@kernel.org>

commit b5b28aefcf024c86c3f930293ba36482f96faf34
Author: Gao Xiang <xiang@kernel.org>
Date:   Mon May 10 14:47:15 2021 +0800

    erofs: fix 1 lcluster-sized pcluster for big pcluster

    If the 1st NONHEAD lcluster of a pcluster isn't of the CBLKCNT
    lcluster type but of a HEAD or PLAIN type instead, its pclustersize
    _must_ be 1 lcluster (since its uncompressed size < 2 lclusters),
    as illustrated below:

           HEAD     HEAD / PLAIN    lcluster type
       ____________ ____________
      |_:__________|_________:__|   file data (uncompressed)
       .                .
      .____________.
      |____________|                pcluster data (compressed)

    Such an on-disk case was explained before [1] but was not handled
    properly in the runtime implementation.

    It can be observed if manually generating 1 lcluster-sized pcluster
    with 2 lclusters (thus CBLKCNT doesn't exist.) Let's fix it now.

    [1] https://lore.kernel.org/r/20210407043927.10623-1-xiang@kernel.org

    Link: https://lore.kernel.org/r/20210510064715.29123-1-xiang@kernel.org
    Fixes: cec6e93beadf ("erofs: support parsing big pcluster compress indexes")
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <xiang@kernel.org>

commit 2cfa0bcf32db1431e18d636e0ff5c592768b9620
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:27 2021 +0800

    erofs: enable big pcluster feature

    Enable COMPR_CFGS and BIG_PCLUSTER since the implementations are
    all settled properly.

    Link: https://lore.kernel.org/r/20210407043927.10623-11-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit d75144d8d0395bca0a1a629a3b9ab6a95112a083
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:26 2021 +0800

    erofs: support decompress big pcluster for lz4 backend

    Prior to big pcluster, there was only one compressed page so it was
    easy to map it. However, when big pcluster is enabled, more work
    needs to be done to handle multiple compressed pages. In detail,

     - (maptype 0) if there is only one compressed page + no need
       to copy inplace I/O, just map it directly as we did before;

     - (maptype 1) if there are more compressed pages + no need to
       copy inplace I/O, vmap such compressed pages instead;

     - (maptype 2) if inplace I/O needs to be copied, use per-CPU
       buffers for decompression then.
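
    The three cases above amount to a small decision function. The enum
    and function names below are hypothetical; only the three numbered
    cases come from this message:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three cases listed above. */
enum maptype {
	MAPTYPE_DIRECT = 0,	/* one compressed page: map it directly */
	MAPTYPE_VMAP = 1,	/* several compressed pages: vmap them */
	MAPTYPE_PERCPU_COPY = 2	/* inplace I/O must be copied: per-CPU buffer */
};

static enum maptype choose_maptype(unsigned int nr_compressed_pages,
				   bool must_copy_inplace)
{
	/* The copy case takes priority regardless of the page count. */
	if (must_copy_inplace)
		return MAPTYPE_PERCPU_COPY;
	return nr_compressed_pages == 1 ? MAPTYPE_DIRECT : MAPTYPE_VMAP;
}
```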

    Another thing is how to detect whether inplace decompression is
    feasible (it's still quite easy for non big pclusters). Apart from
    the inplace margin calculation, the inplace I/O page reusing order
    also needs to be considered for each compressed page. Currently, if
    the compressed page is the xth page, it shouldn't be reused as [0 ...
    nrpages_out - nrpages_in + x], otherwise a full copy will be triggered.

    Although there are some extra optimization ideas for this, I'd like
    to make big pcluster work correctly first and obviously it can be
    further optimized later since it has nothing to do with the on-disk
    format at all.

    Link: https://lore.kernel.org/r/20210407043927.10623-10-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit f344f71c42af2866c748ae22e1b133a02594b367
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:25 2021 +0800

    erofs: support parsing big pcluster compact indexes

    Different from non-compact indexes, several lclusters are packed
    in the compact form at once and a unique base blkaddr is stored for
    each pack, so each lcluster index takes less space on average
    (e.g. 2 bytes for COMPACT_2B). By the way, that is also why the
    BIG_PCLUSTER switch should be consistent for compact head0/1.

    Prior to big pcluster, the size of all pclusters was 1 lcluster.
    Therefore, when a new HEAD lcluster was scanned, blkaddr would be
    bumped by 1 lcluster. However, that way doesn't work anymore for
    big pcluster since we actually don't know the compressed size of
    pclusters in advance (before reading CBLKCNT lcluster).

    So, instead, let blkaddr of each pack be the first pcluster blkaddr
    with a valid CBLKCNT, in detail,

     1) if CBLKCNT starts at the pack, this first valid pcluster is
        itself, e.g.
      _____________________________________________________________
     |_CBLKCNT0_|_NONHEAD_| .. |_HEAD_|_CBLKCNT1_| ... |_HEAD_| ...
     ^ = blkaddr base          ^ += CBLKCNT0           ^ += CBLKCNT1

     2) if CBLKCNT doesn't start at the pack, the first valid pcluster
        is the next pcluster, e.g.
      _________________________________________________________
     | NONHEAD_| .. |_HEAD_|_CBLKCNT0_| ... |_HEAD_|_HEAD_| ...
                    ^ = blkaddr base        ^ += CBLKCNT0
                                                   ^ += 1

    When a CBLKCNT is found, blkaddr will be increased by CBLKCNT
    lclusters; if a new HEAD is found immediately instead, blkaddr is
    bumped by 1 (see the picture above).
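
    The walk over one pack can be sketched as below. This is a simplified
    model under stated assumptions, not the real compact index decoder:
    lclusters arrive already classified, and the type names, struct and
    function are all hypothetical.

```c
#include <stddef.h>

enum lcluster_type { LC_HEAD, LC_CBLKCNT, LC_NONHEAD };

struct lcluster {
	enum lcluster_type type;
	unsigned int cblkcnt;	/* only meaningful for LC_CBLKCNT */
};

/*
 * Fill head_blkaddr[i] for each HEAD entry in the pack (other slots are
 * left untouched). `base' is the blkaddr stored for the pack, i.e. the
 * blkaddr of the first pcluster with a valid CBLKCNT.
 */
static void pack_head_blkaddrs(const struct lcluster *pack, size_t n,
			       unsigned int base, unsigned int *head_blkaddr)
{
	unsigned int cur = base;
	unsigned int size = 1;	/* size of the pcluster located at `cur' */
	int valid = 0;		/* has the pcluster at `base' been seen yet? */
	size_t i;

	for (i = 0; i < n; i++) {
		switch (pack[i].type) {
		case LC_CBLKCNT:
			/* block count of the pcluster currently at `cur' */
			size = pack[i].cblkcnt;
			valid = 1;
			break;
		case LC_HEAD:
			/* skip over the previous pcluster, if there was one */
			if (valid)
				cur += size;
			head_blkaddr[i] = cur;
			size = 1;	/* default until a CBLKCNT follows */
			valid = 1;
			break;
		case LC_NONHEAD:
			break;
		}
	}
}
```

    With base 100 and the layout of case 2) using CBLKCNT0 = 3, the three
    HEADs land on 100, 103 and 104.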

    Also note that if CBLKCNT is at the end of the pack, instead of
    storing delta1 (distance to the next HEAD lcluster) as normal NONHEADs
    do, it still stores the compressed block count (delta0), since delta1
    can be calculated indirectly but the block count can't.

    Adjust decoding logic to fit big pcluster compact indexes as well.

    Link: https://lore.kernel.org/r/20210407043927.10623-9-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 7af2a5cf065073d6f43298b2c96676f9315709d5
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:24 2021 +0800

    erofs: support parsing big pcluster compress indexes

    When INCOMPAT_BIG_PCLUSTER sb feature is enabled, legacy compress indexes
    will also have the same on-disk header as compact indexes to keep per-file
    configurations instead of leaving it zeroed.

    If ADVISE_BIG_PCLUSTER is set for a file, CBLKCNT will be loaded for each
    pcluster in this file by parsing 1st non-head lcluster.

    Link: https://lore.kernel.org/r/20210407043927.10623-8-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 81a0c5100c6b09b91b7cfdad429fc66d65335be2
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:23 2021 +0800

    erofs: adjust per-CPU buffers according to max_pclusterblks

    Adjust per-CPU buffers on demand since big pcluster definition is
    available. Also, bail out unsupported pcluster size according to
    Z_EROFS_PCLUSTER_MAX_SIZE.

    Link: https://lore.kernel.org/r/20210407043927.10623-7-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 56612c78a9aeefc38d6b9bd7a6fef06eebe0c4b6
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:22 2021 +0800

    erofs: add big physical cluster definition

    Big pcluster means that the size of compressed data for each physical
    cluster (pcluster) is no longer fixed at the block size, but can be
    more than 1 block (more accurately, 1 logical cluster).

    When big pcluster feature is enabled for head0/1, delta0 of the 1st
    non-head lcluster index will keep block count of this pcluster in
    lcluster size instead of 1. Or, the compressed size of pcluster
    should be 1 lcluster if pcluster has no non-head lcluster index.

    Also note that BIG_PCLUSTER feature reuses COMPR_CFGS feature since
    it depends on COMPR_CFGS and will be released together.

    Link: https://lore.kernel.org/r/20210407043927.10623-6-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit a67309917444753f1cebfee2d2503cf68269e54a
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:21 2021 +0800

    erofs: fix up inplace I/O pointer for big pcluster

    When picking up inplace I/O pages, they should be traversed in reverse
    order, in line with the traversal order of file-backed online pages.
    Also, index should be updated together when preloading compressed pages.

    Previously, only page-sized pclustersize was supported so no problem
    at all. Also rename `compressedpages' to `icpage_ptr' to reflect its
    functionality.

    Link: https://lore.kernel.org/r/20210407043927.10623-5-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 8fabf77d1a435d68b2bbb89c51f8351ef8efed26
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:20 2021 +0800

    erofs: introduce physical cluster slab pools

    Since multiple pcluster sizes could be used at once, the number of
    compressed pages will become a variable factor. It's necessary to
    introduce slab pools rather than a single slab cache now.

    This limits the pclustersize to 1M (Z_EROFS_PCLUSTER_MAX_SIZE), and
    get rid of the obsolete EROFS_FS_CLUSTER_PAGE_LIMIT, which has no
    use now.
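
    As a rough sketch of the idea, a pcluster could be served from the
    smallest pool whose capacity covers its compressed page count. The
    pool capacities below are illustrative assumptions (only the 1M cap
    comes from this message, taken here as 256 pages of 4K), and the
    names are hypothetical.

```c
#include <stddef.h>

/* Illustrative pool capacities in pages; 256 pages = 1M with 4K pages. */
static const unsigned int pool_maxpages[] = { 1, 4, 16, 64, 128, 256 };

/* Return the index of the smallest pool that fits, or -1 if too large. */
static int pick_pcluster_pool(unsigned int nrpages)
{
	size_t i;

	for (i = 0; i < sizeof(pool_maxpages) / sizeof(pool_maxpages[0]); i++)
		if (nrpages <= pool_maxpages[i])
			return (int)i;
	return -1;
}
```

    A few size classes keep slab waste bounded without needing one cache
    per possible page count.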

    Link: https://lore.kernel.org/r/20210407043927.10623-4-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit c9b891a3fd81d315815f496f1282c95e98507812
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Sat Apr 10 03:06:30 2021 +0800

    erofs: introduce multipage per-CPU buffers

    To deal with the cases in which inplace decompression is infeasible
    for some inplace I/O, per-CPU buffers were introduced to get rid of page
    allocation latency and thrash for low-latency decompression algorithms
    such as lz4.

    For the big pcluster feature, introduce multipage per-CPU buffers to
    keep such inplace I/O pclusters temporarily as well, but note that
    per-CPU pages are only virtually consecutive.

    When a new big pcluster fs is mounted, its max pclustersize will be
    read and per-CPU buffers can be grown if needed. Shrinking adjustable
    per-CPU buffers is more complex (because we don't know if such a size
    is still being used), so currently just release them all when unloading.

    Link: https://lore.kernel.org/r/20210409190630.19569-1-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 6751c7549b38cfe2044fc3d6e03c25c0067e700d
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Apr 7 12:39:18 2021 +0800

    erofs: reserve physical_clusterbits[]

    The formal big pcluster design is actually more powerful and flexible
    than the previous idea, whose pclustersize was fixed as power-of-2 blocks,
    which was obviously inefficient and space-wasting. Instead, pclustersize
    can now be set independently for each pcluster, so various pcluster
    sizes can also be used together in one file if mkfs wants (for example,
    according to data type and/or compression ratio).

    Let's get rid of previous physical_clusterbits[] setting (also notice
    that corresponding on-disk fields are still 0 for now). Therefore,
    head1/2 can be used for at most 2 different algorithms in one file and
    again pclustersize is now independent of these.

    Link: https://lore.kernel.org/r/20210407043927.10623-2-xiang@kernel.org
    Acked-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 7c717bd2fb96a7ee82346bc88ddd28c5812c689d
Author: Ruiqi Gong <gongruiqi1@huawei.com>
Date:   Wed Mar 31 05:39:20 2021 -0400

    erofs: Clean up spelling mistakes found in fs/erofs

    zmap.c: s/correspoinding/corresponding
    zdata.c: s/endding/ending

    Link: https://lore.kernel.org/r/20210331093920.31923-1-gongruiqi1@huawei.com
    Reported-by: Hulk Robot <hulkci@huawei.com>
    Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com>
    Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 44f277dee13de691fe1fc483b55b4bc8ade3da36
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Mon Mar 29 18:00:12 2021 +0800

    erofs: add on-disk compression configurations

    Add a bitmap for available compression algorithms and a variable-sized
    on-disk table for compression options in preparation for upcoming big
    pcluster and LZMA algorithm, which follows the end of super block.

    To parse the compression options, the bitmap is scanned bit by bit.
    For each available algorithm, its data is preceded by a 2-byte `length'
    (that is enough for most cases; otherwise entire fs blocks should
    be used).
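
    A minimal sketch of such a scan over an in-memory table. It assumes
    a 16-bit bitmap, a little-endian 2-byte length placed before each
    record, and no alignment padding between records. All names are
    hypothetical and this is not the kernel's metadata reader.

```c
#include <stddef.h>
#include <stdint.h>

struct compr_cfg {
	unsigned int alg;	/* bit position in the bitmap */
	const uint8_t *data;	/* points into the input buffer */
	uint16_t len;
};

/*
 * Walk the set bits of `available' from bit 0 upwards and slice one
 * length-prefixed record per bit out of buf. Returns the number of
 * records parsed, or -1 on an unsupported algorithm or truncated table.
 */
static int parse_compr_cfgs(uint16_t available, uint16_t supported,
			    const uint8_t *buf, size_t buflen,
			    struct compr_cfg *out, size_t max_out)
{
	size_t off = 0;
	int n = 0;
	unsigned int alg;

	if (available & ~supported)
		return -1;	/* refuse images using unknown algorithms */

	for (alg = 0; alg < 16; alg++) {
		uint16_t len;

		if (!(available & (1u << alg)))
			continue;
		if ((size_t)n >= max_out || buflen - off < 2)
			return -1;
		len = (uint16_t)(buf[off] | (buf[off + 1] << 8));
		off += 2;
		if (buflen - off < len)
			return -1;
		out[n].alg = alg;
		out[n].data = buf + off;
		out[n].len = len;
		off += len;
		n++;
	}
	return n;
}
```

    Checking the bitmap against a supported mask first also covers the
    mount-time refusal of images with unknown algorithms.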

    With such available algorithm bitmap, kernel itself can also refuse to
    mount such filesystem if any unsupported compression algorithm exists.

    Note that COMPR_CFGS feature will be enabled with BIG_PCLUSTER.

    Link: https://lore.kernel.org/r/20210329100012.12980-1-hsiangkao@aol.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit e43a280cd5ca073e9d8cfa0471cdabf8f8500181
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Mon Mar 29 09:23:07 2021 +0800

    erofs: introduce on-disk lz4 fs configurations

    Introduce z_erofs_lz4_cfgs to store all lz4 configurations.
    Currently it's only max_distance, but will be used for new
    features later.

    Link: https://lore.kernel.org/r/20210329012308.28743-4-hsiangkao@aol.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit d4108bf277b411bfdfa0eb12c2172b4035471d8b
Author: Huang Jianan <huangjianan@oppo.com>
Date:   Mon Mar 29 09:23:06 2021 +0800

    erofs: support adjust lz4 history window size

    lz4 uses LZ4_DISTANCE_MAX to record history preservation. When
    using rolling decompression, a block with a higher compression
    ratio will cause a larger memory allocation (up to 64k). It may
    cause a large resource burden in extreme cases on devices with
    small memory and a large number of concurrent IOs. So appropriately
    reducing this value can improve performance.

    Decreasing this value will reduce the compression ratio (except
    when input_size < LZ4_DISTANCE_MAX). But considering that erofs
    currently only supports 4k output, reducing this value will not
    significantly reduce the compression benefits.

    The maximum value of LZ4_DISTANCE_MAX defined by lz4 is 64k, and
    we can only reduce this value. For the old kernel, it just can't
    reduce the memory allocation during rolling decompression without
    affecting the decompression result.
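
    To make the memory effect concrete, here is a small sketch of how
    many history pages a given window implies. Treating 0 as "use the
    lz4 default" and clamping larger values are assumptions of this
    example, and the function name is hypothetical.

```c
#define LZ4_DISTANCE_MAX 65535u

/* Pages of history implied by max_distance; 0 falls back to the default. */
static unsigned int lz4_history_pages(unsigned int max_distance,
				      unsigned int page_size)
{
	if (max_distance == 0 || max_distance > LZ4_DISTANCE_MAX)
		max_distance = LZ4_DISTANCE_MAX;
	/* round up so a partially used page is still counted */
	return (max_distance + page_size - 1) / page_size;
}
```

    With 4K pages the default window needs 16 pages of history, while an
    8K window needs only 2.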

    Link: https://lore.kernel.org/r/20210329012308.28743-3-hsiangkao@aol.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Huang Jianan <huangjianan@oppo.com>
    Signed-off-by: Guo Weichao <guoweichao@oppo.com>
    [ Gao Xiang: introduce struct erofs_sb_lz4_info for configurations. ]
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 89a30917b8f584f34216b053ff5e4b8e1fa1a81a
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Mon Mar 29 09:23:05 2021 +0800

    erofs: introduce erofs_sb_has_xxx() helpers

    Introduce erofs_sb_has_xxx() to make long checks short, especially
    for later big pcluster & LZMA features.
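
    One way such helpers can be generated is with a small macro, as
    sketched below. The struct layout, bit values and macro name are
    hypothetical and only illustrate the pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values and superblock-info layout, for illustration. */
#define FEATURE_INCOMPAT_COMPR_CFGS	0x00000001u
#define FEATURE_INCOMPAT_BIG_PCLUSTER	0x00000002u

struct sb_info {
	uint32_t feature_incompat;
};

/* Generate sb_has_<name>() so call sites avoid open-coded mask tests. */
#define FEATURE_FUNCS(name, field, mask)				\
static inline bool sb_has_##name(const struct sb_info *sbi)		\
{									\
	return (sbi->field & (mask)) != 0;				\
}

FEATURE_FUNCS(compr_cfgs, feature_incompat, FEATURE_INCOMPAT_COMPR_CFGS)
FEATURE_FUNCS(big_pcluster, feature_incompat, FEATURE_INCOMPAT_BIG_PCLUSTER)
```

    A caller then writes sb_has_big_pcluster(sbi) instead of repeating
    the field and mask at every check.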

    Link: https://lore.kernel.org/r/20210329012308.28743-2-hsiangkao@aol.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 83849318acff8125846f2447ed318f80db4dde38
Author: Yue Hu <huyue2@yulong.com>
Date:   Thu Mar 25 15:10:08 2021 +0800

    erofs: don't use erofs_map_blocks() any more

    Currently, erofs_map_blocks() will be called only from
    erofs_{bmap, read_raw_page} which are all for uncompressed files.
    So, the compression branch in erofs_map_blocks() is pointless. Let's
    remove it and use erofs_map_blocks_flatmode() directly. Also update
    related comments.

    Link: https://lore.kernel.org/r/20210325071008.573-1-zbestahu@gmail.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Yue Hu <huyue2@yulong.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit dd3b7a71fb79a620a8df1138d74c990df27e04a5
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Mon Mar 22 02:32:27 2021 +0800

    erofs: complete a missing case for inplace I/O

    Handle a missing case in which pages were allocated unnecessarily
    instead of directly using inplace I/O, which increased the extra
    runtime memory footprint.

    The detail is, considering an online file-backed page, the right half
    of the page is chosen to be cached (e.g. the end page of a readahead
    request) and some of its data doesn't exist in managed cache, so the
    pcluster will be definitely kept in the submission chain. (IOWs, it
    cannot be decompressed without I/O, e.g., due to the bypass queue).

    Currently, DELAYEDALLOC/TRYALLOC cases can be downgraded as NOINPLACE,
    and stop online pages from inplace I/O. After this patch, unneeded page
    allocations won't be observed in pickup_page_for_submission() then.

    Link: https://lore.kernel.org/r/20210321183227.5182-1-hsiangkao@aol.com
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 2195652f604a78eff0b808c94f7c31c0648d42e8
Author: Huang Jianan <huangjianan@oppo.com>
Date:   Wed Mar 17 11:54:47 2021 +0800

    erofs: use workqueue decompression for atomic contexts only

    z_erofs_decompressqueue_endio may not be executed in the atomic
    context, for example, when dm-verity is turned on. In this scenario,
    data can be decompressed directly to get rid of additional kworker
    scheduling overhead.

    Link: https://lore.kernel.org/r/20210317035448.13921-2-huangjianan@oppo.com
    Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Huang Jianan <huangjianan@oppo.com>
    Signed-off-by: Guo Weichao <guoweichao@oppo.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 50a12c462dbc5e3e4d14dc427392fc8e571b1b0b
Author: Huang Jianan <huangjianan@oppo.com>
Date:   Tue Mar 16 11:15:14 2021 +0800

    erofs: avoid memory allocation failure during rolling decompression

    Currently, err would be treated as io error. Therefore, it'd be
    better to ensure memory allocation during rolling decompression
    to avoid such io error.

    In the long term, we might consider adding another !Uptodate case
    for such case.

    Link: https://lore.kernel.org/r/20210316031515.90954-1-huangjianan@oppo.com
    Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Huang Jianan <huangjianan@oppo.com>
    Signed-off-by: Guo Weichao <guoweichao@oppo.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>

commit 5a664357076596a3af1100413bafb00a88dc5ef2
Author: kondors1995 <normandija1945@gmail.com>
Date:   Mon May 9 16:44:49 2022 +0000

    raphael_defconfig: Enable EROFS

commit 2409ea765730e7ca72fcc71dc3989eb37306ed81
Author: Tom Levy <tomlevy93@gmail.com>
Date:   Tue Jul 16 16:30:24 2019 -0700

    include/linux/lz4.h: fix spelling and copy-paste errors in documentation

    Fix a few spelling and grammar errors, and two places where fast/safe in
    the documentation did not match the function.

    Link: http://lkml.kernel.org/r/20190321014452.13297-1-tomlevy93@gmail.com
    Signed-off-by: Tom Levy <tomlevy93@gmail.com>
    Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
    Cc: Jiri Kosina <trivial@kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    Signed-off-by: Panchajanya1999 <panchajanya@azure-dev.live>

commit 416572f0ce1a90146cb73dd5ea3667899d0f8241
Author: John Galt <johngaltfirstrun@gmail.com>
Date:   Tue May 3 16:09:48 2022 -0400

    erofs: compression fixes

commit 8af69e641af0cd017664fbf2fbd9ce2509b2b8dc
Author: Luan Cachoroski Halaiko <luhalaiko@gmail.com>
Date:   Tue Feb 8 20:20:47 2022 -0300

    erofs: fixes for compilation

    Signed-off-by: Luan Cachoroski Halaiko <luhalaiko@gmail.com>

commit ad81e37ce0d0af5bdb0115a7eccd673c03d293f0
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Wed Dec 9 20:37:17 2020 +0800

    erofs: force inplace I/O under low memory scenario

    Try to forcibly switch to inplace I/O under low memory scenarios in
    order to avoid direct memory reclaim due to cached page allocation.

    Link: https://lore.kernel.org/r/20201209123717.12430-1-hsiangkao@aol.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Change-Id: I8ea2d3b59c68125271f66853cf5dc6ca39e7aaa9

commit e4018facd91f25eb223b94416d1b64f641618577
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Tue Dec 8 17:58:34 2020 +0800

    erofs: simplify try_to_claim_pcluster()

    Simplify try_to_claim_pcluster() by directly using cmpxchg() here
    (the retry loop caused more overhead.) Also, move the chain loop
    detection in and rename it to z_erofs_try_to_claim_pcluster().
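
    The single compare-and-exchange idea can be sketched in userspace
    with C11 atomics standing in for the kernel's cmpxchg(). The struct,
    the tail sentinel and the function name are hypothetical, and the
    chain loop detection is left out.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct pcluster {
	_Atomic(struct pcluster *) next;	/* NULL means "not on any chain" */
};

/* Sentinel terminating a chain, so a claimed pcluster never has next == NULL. */
static struct pcluster chain_tail;

/*
 * Try to link pcl in front of the chain we own. A single compare-exchange
 * either claims it (next: NULL -> old head) or tells us someone else did.
 */
static bool try_to_claim(struct pcluster *pcl, struct pcluster **owned_head)
{
	struct pcluster *expected = NULL;

	if (!atomic_compare_exchange_strong(&pcl->next, &expected, *owned_head))
		return false;
	*owned_head = pcl;
	return true;
}
```

    A chain starts with its head pointing at &chain_tail. A failed claim
    needs no retry because the pcluster already belongs to some chain.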

    Link: https://lore.kernel.org/r/20201208095834.3133565-3-hsiangkao@redhat.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Change-Id: I8d091ff44123b099ef199eaa4200a00b8854623f

commit f28d114732f644b4a6445316095db1f0e818472f
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Tue Dec 8 17:58:33 2020 +0800

    erofs: insert to managed cache after adding to pcl

    Previously, there was some concern about calling add_to_page_cache_lru()
    with page->mapping == Z_EROFS_MAPPING_STAGING (!= NULL).

    In contrast, page->private is used instead now, so partially revert
    commit 5ddcee1f3a1c ("erofs: get rid of __stagingpage_alloc helper")
    with some adaptation for simplicity.

    Link: https://lore.kernel.org/r/20201208095834.3133565-2-hsiangkao@redhat.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Change-Id: If250d62b47083649e96d0937eb1990b6c84d768f

commit 1a79fe1a476ae08ed0609618951fe863df0ac03a
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Tue Dec 8 17:58:32 2020 +0800

    erofs: get rid of magical Z_EROFS_MAPPING_STAGING

    Previously, we played around with magical page->mapping for short-lived
    temporary pages since we need to identify different types of pages in
    the same pcluster but both invalidated and short-lived temporary pages
    can have page->mapping == NULL. It was considered safe because
    temporary pages are all non-LRU / non-movable pages.

    This patch uses a specific page->private value to identify short-lived
    pages instead, so it no longer relies on page->mapping. Details are
    described in "compress.h" as well.

    Link: https://lore.kernel.org/r/20201208095834.3133565-1-hsiangkao@redhat.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Change-Id: I2c8650e80cb6016ed828d04f89f8bd3512ca3fb2

commit a50789da7af81e73a8cb0081e788cea5543eff5c
Author: Vladimir Zapolskiy <vladimir@tuxera.com>
Date:   Fri Oct 30 14:28:39 2020 +0200

    erofs: remove a void EROFS_VERSION macro set in Makefile

    Since commit 4f761fa253b4 ("erofs: rename errln/infoln/debugln to
    erofs_{err, info, dbg}") the defined macro EROFS_VERSION has no effect,
    therefore removing it from the Makefile is a non-functional change.

    Link: https://lore.kernel.org/r/20201030122839.25431-1-vladimir@tuxera.com
    Reviewed-by: Gao Xiang <hsiangkao@redhat.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Vladimir Zapolskiy <vladimir@tuxera.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Change-Id: Id63ad279985db2a156d62be814bf381c9bea8342

commit d929ef94d4aab35ae96fb6d6efd1a0a23f7d1b48
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Mon Aug 30 11:44:53 2021 +0800

    erofs: move from drivers/staging/ to fs/

    Since 5.4, erofs has been moved into fs/.

    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Change-Id: I95dd967a0097629a9d8eaed1dc11e2cd04f47701

commit 2758a8239cc772c63d5463073b44626ee4e7695a
Author: Gao Xiang <hsiangkao@linux.alibaba.com>
Date:   Wed Aug 25 11:42:03 2021 +0800

    erofs: sync up with kernel 5.10

    Backport 5.10 LTS erofs to 4.19.

    Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
    Change-Id: Ibf9c0c47e46090b72e75f09a347100f4ff64f28d

commit 1ee3b56216b0d92e2134d6134d2027c842f495b6
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Mon Mar 29 08:36:14 2021 +0800

    erofs: add unsupported inode i_format check

    commit 24a806d849c0b0c1d0cd6a6b93ba4ae4c0ec9f08 upstream.

    If any unknown i_format fields are set (may be of some new incompat
    inode features), mark such inode as unsupported.

    Just in case of any new incompat i_format fields added in the future.
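
    A minimal sketch of such a check. The bit layout (1 version bit
    followed by 3 data layout bits) and all names are assumptions made
    for illustration.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed layout: bit 0 = inode version, bits 1-3 = data layout. */
#define I_VERSION_BITS		1
#define I_DATALAYOUT_BITS	3
#define I_ALL	((1u << (I_VERSION_BITS + I_DATALAYOUT_BITS)) - 1)

/* Return 0 if every set bit is understood, -EOPNOTSUPP otherwise. */
static int check_i_format(uint16_t ifmt)
{
	if (ifmt & ~I_ALL)
		return -EOPNOTSUPP;
	return 0;
}
```

    Rejecting any bit outside the known mask means a future incompat
    field is refused rather than silently misread.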

    Link: https://lore.kernel.org/r/20210329003614.6583-1-hsiangkao@aol.com
    Fixes: 431339ba9042 ("staging: erofs: add inode operations")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 316472dda45a6a8142fc80800fa92f2846911008
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Thu Jul 30 01:58:01 2020 +0800

    erofs: fix extended inode could cross boundary

    commit 0dcd3c94e02438f4a571690e26f4ee997524102a upstream.

    Each ondisk inode should be aligned with the inode slot boundary
    (32-byte alignment) because of the nid calculation formula, so
    compact inodes (32 bytes) cannot cross a page boundary. However,
    the extended inode is now a 64-byte form, which can cross a page
    boundary in principle if the location is specified on purpose,
    although such images are hard to generate with mkfs due to the
    allocation policy, and extended inodes are rarely used in Android
    use cases now (mainly for > 4GiB files).

    For now, only two fields `i_ctime_nsec' and `i_nlink' couldn't
    be read from disk properly, causing an out-of-bound memory read
    with random values.

    Let's fix it now.
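
    The boundary condition can be illustrated with a small sketch. A
    4096-byte page is assumed, and the names are hypothetical.

```c
#include <stdbool.h>

#define PAGE_SZ		4096u	/* assumed page size */
#define SLOT_SZ		32u	/* inode slot alignment from the text */

/* Does an on-disk inode of `isize' bytes at page offset `ofs' spill over? */
static bool inode_crosses_page(unsigned int ofs, unsigned int isize)
{
	return ofs + isize > PAGE_SZ;
}
```

    A 32-byte inode on any slot boundary always fits, but a 64-byte
    inode placed in the last slot (offset 4064) spills into the next
    page.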

    Fixes: 431339ba9042 ("staging: erofs: add inode operations")
    Cc: <stable@vger.kernel.org> # 4.19+
    Link: https://lore.kernel.org/r/20200729175801.GA23973@xiangao.remote.csb
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    [ Gao Xiang: resolve non-trivial conflicts for latest 4.19.y. ]
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ee000f1badb6ca558527d2e99e6130e56fe6acfb
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Sun Nov 1 03:51:02 2020 +0800

    erofs: derive atime instead of leaving it empty

    commit d3938ee23e97bfcac2e0eb6b356875da73d700df upstream.

    EROFS has _only one_ ondisk timestamp (ctime is currently
    documented and recorded, we might also record mtime instead
    with a new compat feature if needed) for each extended inode
    since EROFS isn't mainly for archival purposes so no need to
    keep all timestamps on disk especially for Android scenarios
    due to security concerns. Also, romfs/cramfs don't have their
    own on-disk timestamp, and squashfs only records mtime instead.

    Let's also derive access time from ondisk timestamp rather than
    leaving it empty, and if mtime/atime for each file are really
    needed for specific scenarios as well, we can also use xattrs
    to record them then.

    Link: https://lore.kernel.org/r/20201031195102.21221-1-hsiangkao@aol.com
    [ Gao Xiang: It'd be better to backport for user-friendly concern. ]
    Fixes: 431339ba9042 ("staging: erofs: add inode operations")
    Cc: stable <stable@vger.kernel.org> # 4.19+
    Reported-by: nl6720 <nl6720@gmail.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    [ Gao Xiang: Manually backport to 4.19.y due to trivial conflicts. ]
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0601575a0ca46c49aaf765aaba9df8c1ce63cc9a
Author: Gao Xiang <hsiangkao@redhat.com>
Date:   Fri Jun 19 07:43:49 2020 +0800

    erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup

    commit 3c597282887fd55181578996dca52ce697d985a5 upstream.

    Hongyu reported "id != index" in z_erofs_onlinepage_fixup() with
    specific aarch64 environment easily, which wasn't shown before.

    After digging into that, I found that high 32 bits of page->private
    was set to 0xaaaaaaaa rather than 0 (due to z_erofs_onlinepage_init
    behavior with specific compiler options). Actually we only use low
    32 bits to keep the page information since page->private is only 4
    bytes on most 32-bit platforms. However z_erofs_onlinepage_fixup()
    uses the upper 32 bits by mistake.
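
    The effect can be reproduced with plain integers. The layout below
    (a page index stored above 2 low counter bits) and the function
    names are assumptions for illustration only.

```c
#include <stdint.h>

#define INDEX_SHIFT 2	/* assumed: low bits hold a small counter */

/* Buggy variant: shifts the whole 64-bit word, so stale high bits leak in. */
static uint64_t index_from_private_buggy(uint64_t priv)
{
	return priv >> INDEX_SHIFT;
}

/* Fixed variant: only the low 32 bits carry page information. */
static uint64_t index_from_private(uint64_t priv)
{
	return (uint32_t)priv >> INDEX_SHIFT;
}
```

    With the high half set to 0xaaaaaaaa, only the truncating variant
    still recovers the stored index.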

    Let's fix it now.

    Reported-and-tested-by: Hongyu Jin <hongyu.jin@unisoc.com>
    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Link: https://lore.kernel.org/r/20200618234349.22553-1-hsiangkao@aol.com
    Signed-off-by: Gao Xiang <hsiangkao@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 02cee974cb788dd6b23837c04e347dbadccb7e67
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Feb 26 16:10:06 2020 +0800

    erofs: correct the remaining shrink objects

    commit 9d5a09c6f3b5fb85af20e3a34827b5d27d152b34 upstream.

    The remaining count should not include successful
    shrink attempts.

    Fixes: e7e9a307be9d ("staging: erofs: introduce workstation for decompression")
    Cc: <stable@vger.kernel.org> # 4.19+
    Link: https://lore.kernel.org/r/20200226081008.86348-1-gaoxiang25@huawei.com
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit afe022d9f5721497e63d11d3fdb06c95c6256c23
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Sun Dec 1 16:01:09 2019 +0800

    erofs: zero out when listxattr is called with no xattr

    commit 926d1650176448d7684b991fbe1a5b1a8289e97c upstream.

    As David reported [1], ENODATA returns when attempting
    to modify files by using EROFS as an overlayfs lower layer.

    The root cause is that listxattr could return unexpected
    -ENODATA by mistake for inodes without xattr. That breaks
    listxattr return value convention and it can cause copy
    up failure when used with overlayfs.

    Resolve by zeroing out if no xattr is found for listxattr.

    [1] https://lore.kernel.org/r/CAEvUa7nxnby+rxK-KRMA46=exeOMApkDMAV08AjMkkPnTPV4CQ@mail.gmail.com
    Link: https://lore.kernel.org/r/20191201084040.29275-1-hsiangkao@aol.com
    Fixes: cadf1ccf1b00 ("staging: erofs: add error handling for xattr submodule")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fceffbd856369cedfa23b313844d3906de8fd36e
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Oct 9 18:12:39 2019 +0800

    staging: erofs: detect potential multiref due to corrupted images

    commit e12a0ce2fa69798194f3a8628baf6edfbd5c548f upstream.

    As reported by erofs-utils fuzzer, multiref (ondisk deduplication)
    isn't supported for now, so we should forbid it properly.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Link: https://lore.kernel.org/r/20190821140152.229648-1-gaoxiang25@huawei.com
    [ Gao Xiang: Since earlier kernels don't define EFSCORRUPTED,
                 let's use EIO instead. ]
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 9b3495631f1dba2feac41c880e564df6e242c8ab
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Oct 9 18:12:38 2019 +0800

    staging: erofs: add two missing erofs_workgroup_put for corrupted images

    commit 138e1a0990e80db486ab9f6c06bd5c01f9a97999 upstream.

    As reported by erofs-utils fuzzer, these error handling
    paths will be entered to handle corrupted images.

    Missing erofs_workgroup_put() calls will cause unmounting
    to fail.

    Fix these return values to EFSCORRUPTED as well.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Link: https://lore.kernel.org/r/20190819103426.87579-4-gaoxiang25@huawei.com
    [ Gao Xiang: Older kernel versions don't have length validity check
                 and EFSCORRUPTED, thus backport pageofs check for now. ]
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 20b9eea304f612a2cff8690eebc57d228e45b95e
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Oct 9 18:12:37 2019 +0800

    staging: erofs: some compressed cluster should be submitted for corrupted images

    commit ee45197c807895e156b2be0abcaebdfc116487c8 upstream.

    As reported by erofs_utils fuzzer, a logical page can belong
    to at most 2 compressed clusters. One compressed cluster may be
    corrupted while the other is already in the submission chain.

    The chain needs to submit anyway in order to keep the page
    working properly (page unlocked with PG_error set, PG_uptodate
    not set).

    Let's fix it now.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Link: https://lore.kernel.org/r/20190819103426.87579-2-gaoxiang25@huawei.com
    [ Gao Xiang: Manually backport to v4.19.y stable. ]
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c61556faf792f95db0edbee6646fa2f52c8515d1
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Oct 9 18:12:36 2019 +0800

    staging: erofs: fix an error handling in erofs_readdir()

    commit acb383f1dcb4f1e79b66d4be3a0b6f519a957b0d upstream.

    Richard observed a forever loop of erofs_read_raw_page() [1]
    which can be generated by forcibly setting ->u.i_blkaddr
    to 0xdeadbeef (as I understand it, the block layer can
    handle access beyond end of device correctly).

    After digging into that, it seems the problem is closely
    related to directories, and then I found the root cause
    is improper error handling in erofs_readdir().

    Let's fix it now.

    [1] https://lore.kernel.org/r/1163995781.68824.1566084358245.JavaMail.zimbra@nod.at/

    Reported-by: Richard Weinberger <richard@nod.at>
    Fixes: 3aa8ec716e52 ("staging: erofs: add directory operations")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Link: https://lore.kernel.org/r/20190818125457.25906-1-hsiangkao@aol.com
    [ Gao Xiang: Since earlier kernels don't define EFSCORRUPTED,
                 let's use original error code instead. ]
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 44e25b73c4772f5f08d483bbdcfe81c95758e955
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jun 13 16:35:41 2019 +0800

    staging: erofs: add requirements field in superblock

    commit 5efe5137f05bbb4688890620934538c005e7d1d6 upstream.

    Some backward-incompatible features have been pending
    for months, mainly due to on-disk format extensions.

    However, we should ensure that such images cannot be mounted by
    old kernels. Otherwise, it will cause unexpected behavior.
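
The check described above can be sketched in userspace C. This is only an illustration under stated assumptions: `struct demo_super`, `DEMO_REQ_FEATURE_A`, `DEMO_REQ_SUPPORTED` and `demo_check_requirements()` are hypothetical names, the bit values are invented, and the error code is an assumption rather than what the patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical feature bit; real on-disk values are not given in this log. */
#define DEMO_REQ_FEATURE_A	0x00000001u
/* Every incompatible feature this (hypothetical) implementation understands. */
#define DEMO_REQ_SUPPORTED	(DEMO_REQ_FEATURE_A)

struct demo_super {
	uint32_t requirements;	/* incompatible features the image relies on */
};

/* Return 0 if the image may be mounted, -EINVAL if it needs an unknown feature. */
int demo_check_requirements(const struct demo_super *sb)
{
	if (sb->requirements & ~DEMO_REQ_SUPPORTED)
		return -EINVAL;
	return 0;
}

/* Usage: a known bit mounts, an unknown bit is refused. */
void demo_requirements_usage(void)
{
	struct demo_super ok = { .requirements = DEMO_REQ_FEATURE_A };
	struct demo_super newer = { .requirements = 0x00000002u };

	assert(demo_check_requirements(&ok) == 0);
	assert(demo_check_requirements(&newer) == -EINVAL);
}
```

Rejecting any bit outside the supported mask means an older implementation fails the mount cleanly instead of misreading a newer on-disk format.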

    Fixes: ba2b77a82022 ("staging: erofs: add super block operations")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c458b3206aa217c67af63679c67cda21d1bb63fd
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Fri Mar 29 04:14:58 2019 +0800

    staging: erofs: keep corrupted fs from crashing kernel in erofs_readdir()

    commit 33bac912840fe64dbc15556302537dc6a17cac63 upstream.

    After commit 419d6efc50e9, the kernel can no longer be crashed in the
    namei path. However, a corrupted nameoff can still do harm during
    readdir in scenarios without dm-verity. Fix it now.

    Fixes: 3aa8ec716e52 ("staging: erofs: add directory operations")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 77a2c8cadafb7972b2812c097f518ac3099e8a3b
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Mar 25 11:40:07 2019 +0800

    staging: erofs: fix error handling when failed to read compressed data

    commit b6391ac73400eff38377a4a7364bd3df5efb5178 upstream.

    Complete read error handling paths for all three kinds of
    compressed pages:

     1) For cache-managed pages, PG_uptodate will be checked since
        read_endio will unlock and SetPageUptodate for these pages;

     2) For inplaced pages, read_endio cannot SetPageUptodate directly
        since it should be used to mark the final decompressed data,
        PG_error will be set with page locked for IO error instead;

     3) For staging pages, PG_error is used, which is similar to
        what we do for inplaced pages.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 74528ff6c38df709674cc676f67e79eac815e23f
Author: Chao Yu <yuchao0@huawei.com>
Date:   Mon Mar 11 23:10:10 2019 +0800

    staging: erofs: fix to handle error path of erofs_vmap()

    commit 8bce6dcede65139a087ff240127e3f3c01363eed upstream.

    erofs_vmap() wraps vmap() and vm_map_ram() to return virtually
    contiguous memory, but both of them can fail for many reasons.
    Previously, erofs_vmap()'s callers didn't handle such failures,
    which could cause a NULL pointer dereference. Fix it.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Fixes: 0d40d6e399c1 ("staging: erofs: add a generic z_erofs VLE decompressor")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 910cd92ee289977f064971f7659cda0228ec1615
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Fri Nov 23 01:16:00 2018 +0800

    staging: erofs: fix race when the managed cache is enabled

    commit 51232df5e4b268936beccde5248f312a316800be upstream.

    When the managed cache is enabled, the last reference count
    of a workgroup must be used for its workstation.

    Otherwise, it could lead to incorrect (un)freezes in
    the reclaim path, and it would be harmful.

    A typical race is as follows:

    Thread 1 (In the reclaim path)  Thread 2
    workgroup_freeze(grp, 1)                                refcnt = 1
    ...
    workgroup_unfreeze(grp, 1)                              refcnt = 1
                                    workgroup_get(grp)      refcnt = 2 (x)
    workgroup_put(grp)                                      refcnt = 1 (x)
                                    ...unexpected behaviors

    * grp is detached but still used, which violates cache-managed
      freeze constraint.
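
The freeze constraint can be illustrated with a userspace sketch built on C11 atomics. It is not the erofs implementation: `struct demo_group`, `DEMO_LOCKED` and the `demo_*` functions are hypothetical, and the sketch only shows the general rule that a frozen group must not hand out new references.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_LOCKED	(-1)	/* hypothetical marker for a frozen refcount */

struct demo_group {
	atomic_int refcount;
};

/* Freeze succeeds only if the refcount still equals the expected value. */
bool demo_try_to_freeze(struct demo_group *g, int expected)
{
	return atomic_compare_exchange_strong(&g->refcount, &expected,
					      DEMO_LOCKED);
}

/* Restore the refcount that was observed when the group was frozen. */
void demo_unfreeze(struct demo_group *g, int orig)
{
	atomic_store(&g->refcount, orig);
}

/* Take a reference unless the group is frozen (< 0) or already dead (0). */
bool demo_get(struct demo_group *g)
{
	int v = atomic_load(&g->refcount);

	while (v > 0) {
		/* On failure, v is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&g->refcount, &v, v + 1))
			return true;
	}
	return false;
}

/* Usage: a get attempted while the group is frozen must fail. */
void demo_freeze_usage(void)
{
	struct demo_group g;

	atomic_init(&g.refcount, 1);
	assert(demo_try_to_freeze(&g, 1));
	assert(!demo_get(&g));
	demo_unfreeze(&g, 1);
	assert(demo_get(&g));
}
```

In this sketch the reclaim side would freeze at the count held by the owner alone, so a successful freeze proves that no other user holds the group.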

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a906ead6ff3295233d3643d662309cddb7efd896
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Mar 11 14:08:58 2019 +0800

    staging: erofs: keep corrupted fs from crashing kernel in erofs_namei()

    commit 419d6efc50e94bcf5d6b35cd8c71f79edadec564 upstream.

    As Al pointed out, "
    ... and while we are at it, what happens to
    	unsigned int nameoff = le16_to_cpu(de[mid].nameoff);
    	unsigned int matched = min(startprfx, endprfx);

    	struct qstr dname = QSTR_INIT(data + nameoff,
    		unlikely(mid >= ndirents - 1) ?
    			maxsize - nameoff :
    			le16_to_cpu(de[mid + 1].nameoff) - nameoff);

    	/* string comparison without already matched prefix */
    	int ret = dirnamecmp(name, &dname, &matched);
    if le16_to_cpu(de[...].nameoff) is not monotonically increasing?  I.e.
    what's to prevent e.g. (unsigned)-1 ending up in dname.len?

    Corrupted fs image shouldn't oops the kernel.. "

    Revisit the related lookup flow to address the issue.
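
A minimal sketch of the kind of validation Al's question calls for, written as userspace C. The helper `demo_dirent_namelen()` and its parameters are hypothetical, the offsets are assumed to be already converted to CPU byte order, and -EIO is an assumed error code, not necessarily what the patch returns.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: compute the name length of entry `mid` from a table
 * of name offsets. `maxsize` is the number of valid bytes in the block.
 */
int demo_dirent_namelen(const uint16_t *nameoff, unsigned int ndirents,
			unsigned int mid, unsigned int maxsize,
			unsigned int *len)
{
	unsigned int start, end;

	if (mid >= ndirents)
		return -EINVAL;

	start = nameoff[mid];
	end = (mid + 1 < ndirents) ? nameoff[mid + 1] : maxsize;

	/* Reject offsets that are out of range or not strictly increasing. */
	if (start >= maxsize || end > maxsize || end <= start)
		return -EIO;

	*len = end - start;	/* cannot wrap: end > start was checked */
	return 0;
}

/* Usage: entry 1 starts at 20 but the next offset is 17, so it is refused. */
void demo_namelen_usage(void)
{
	const uint16_t offs[] = { 12, 20, 17 };
	unsigned int len = 0;

	assert(demo_dirent_namelen(offs, 3, 0, 64, &len) == 0 && len == 8);
	assert(demo_dirent_namelen(offs, 3, 1, 64, &len) == -EIO);
}
```

Checking the ordering before the subtraction is what keeps a value like (unsigned)-1 out of the length.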

    Fixes: d72d1ce60174 ("staging: erofs: add namei functions")
    Cc: <stable@vger.kernel.org> # 4.19+
    Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6dbf1a15dcd2f0097d819daa4ee1926b2345d02f
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Mar 11 14:08:57 2019 +0800

    staging: erofs: fix race of initializing xattrs of an inode at the same time

    commit 62dc45979f3f8cb0ea67302a93bff686f0c46c5a upstream.

    In real scenarios, several threads could access the xattrs
    of the same xattr-uninitialized inode and call init_inode_xattrs()
    at almost the same time.

    That is unexpected behavior; this patch closes the race.

    Fixes: b17500a0fdba ("staging: erofs: introduce xattr & acl support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 044ba07158562ecf1b2e9079fa97c9980b523eb0
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Mar 11 14:08:56 2019 +0800

    staging: erofs: fix memleak of inode's shared xattr array

    From: Sheng Yong <shengyong1@huawei.com>

    commit 3b1b5291f79d040d549d7c746669fc30e8045b9b upstream.

    If it fails to read a shared xattr page, the inode's shared xattr array
    is not freed. The next time the inode's xattr is accessed, the previously
    allocated array is leaked.
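
The leak and its fix can be sketched in userspace C. All names here (`struct demo_inode`, `demo_read_page()`, `demo_init_shared_xattrs()`) are hypothetical stand-ins, and the `fail` flag only simulates a failed page read.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_inode {
	unsigned int *shared_xattrs;	/* hypothetical shared xattr id array */
	unsigned int shared_count;
};

/* Hypothetical page reader: fails when `fail` is non-zero. */
static int demo_read_page(int fail)
{
	return fail ? -EIO : 0;
}

int demo_init_shared_xattrs(struct demo_inode *inode, unsigned int count,
			    int fail)
{
	int err;

	inode->shared_xattrs = calloc(count, sizeof(*inode->shared_xattrs));
	if (!inode->shared_xattrs)
		return -ENOMEM;

	err = demo_read_page(fail);
	if (err) {
		/* Free on the error path so a retry cannot leak the array. */
		free(inode->shared_xattrs);
		inode->shared_xattrs = NULL;
		return err;
	}

	inode->shared_count = count;
	return 0;
}

/* Usage: a failed read leaves no allocation behind. */
void demo_xattr_usage(void)
{
	struct demo_inode inode = { 0 };

	assert(demo_init_shared_xattrs(&inode, 4, 1) == -EIO);
	assert(inode.shared_xattrs == NULL);
}
```

Resetting the pointer to NULL after freeing also keeps a later retry from seeing a stale array.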

    Signed-off-by: Sheng Yong <shengyong1@huawei.com>
    Fixes: b17500a0fdba ("staging: erofs: introduce xattr & acl support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 240517d98c12632095f2848bd94c30debdcaf600
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Mar 11 14:08:55 2019 +0800

    staging: erofs: fix fast symlink w/o xattr when fs xattr is on

    commit 7077fffcb0b0b65dc75e341306aeef4d0e7f2ec6 upstream.

    Currently, this will hit a BUG_ON for these symlinks as follows:

    - kernel message
    ------------[ cut here ]------------
    kernel BUG at drivers/staging/erofs/xattr.c:59!
    SMP PTI
    CPU: 1 PID: 1170 Comm: getllxattr Not tainted 4.20.0-rc6+ #92
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
    RIP: 0010:init_inode_xattrs+0x22b/0x270
    Code: 48 0f 45 ea f0 ff 4d 34 74 0d 41 83 4c 24 e0 01 31 c0 e9 00 fe ff ff 48 89 ef e8 e0 31 9e ff eb e9 89 e8 e9 ef fd ff ff 0f 0$
     <0f> 0b 48 89 ef e8 fb f6 9c ff 48 8b 45 08 a8 01 75 24 f0 ff 4d 34
    RSP: 0018:ffffa03ac026bdf8 EFLAGS: 00010246
    ------------[ cut here ]------------
    ...
    Call Trace:
     erofs_listxattr+0x30/0x2c0
     ? selinux_inode_listxattr+0x5a/0x80
     ? kmem_cache_alloc+0x33/0x170
     ? security_inode_listxattr+0x27/0x40
     listxattr+0xaf/0xc0
     path_listxattr+0x5a/0xa0
     do_syscall_64+0x43/0xf0
     entry_SYSCALL_64_after_hwframe+0x44/0xa9
    ...
    ---[ end trace 3c24b49408dc0c72 ]---

    Fix it by checking ->xattr_isize in init_inode_xattrs(),
    and it also fixes improper return value -ENOTSUPP
    (it should be -ENODATA if xattr is enabled) for those inodes.

    Fixes: b17500a0fdba ("staging: erofs: introduce xattr & acl support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Reported-by: Li Guifu <bluce.liguifu@huawei.com>
    Tested-by: Li Guifu <bluce.liguifu@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 78544513d768a1559d7e61d5e29270844db027d2
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Mar 11 14:08:54 2019 +0800

    staging: erofs: add error handling for xattr submodule

    commit cadf1ccf1b0021d0b7a9347e102ac5258f9f98c8 upstream.

    This patch adds the missing error handling code for the
    xattr submodule, which improves stability in rare cases.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit f1f405af62a3f3b37bf965ddbc3ef5aa2fab2f57
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Feb 27 13:33:30 2019 +0800

    staging: erofs: compressed_pages should not be accessed again after freed

    commit af692e117cb8cd9d3d844d413095775abc1217f9 upstream.

    This patch resolves the following page use-after-free issue,
    z_erofs_vle_unzip:
        ...
        for (i = 0; i < nr_pages; ++i) {
            ...
            z_erofs_onlinepage_endio(page);  (1)
        }

        for (i = 0; i < clusterpages; ++i) {
            page = compressed_pages[i];

            if (page->mapping == mngda)      (2)
                continue;
            /* recycle all individual staging pages */
            (void)z_erofs_gather_if_stagingpage(page_pool, page); (3)
            WRITE_ONCE(compressed_pages[i], NULL);
        }
        ...

    After (1) is executed, page is freed and could then be reused. If
    compressed_pages is scanned after that, it could fall into (2) or
    (3) by mistake, which could end up in a mess.

    This patch aims to solve the above issue with as few changes
    as possible in order to make the fix easier to backport.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b3a98208a957c0e05850b82ebf7f474ab295ff00
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Feb 27 13:33:31 2019 +0800

    staging: erofs: fix illegal address access under memory pressure

    commit 1e5ceeab6929585512c63d05911d6657064abf7b upstream.

    Consider a read request with two decompressed file pages.
    If a decompression work cannot be started on the previous page
    due to memory pressure but the in-memory LTP map lookup is done,
    builder->work should still be NULL.

    Moreover, if the current page also belongs to the same map,
    it won't try to start the decompression work again and then
    runs into trouble.

    This patch aims to solve the above issue with as few changes
    as possible in order to make the fix easier to backport.

    kernel message is:
    <4>[1051408.015930s]SLUB: Unable to allocate memory on node -1, gfp=0x2408040(GFP_NOFS|__GFP_ZERO)
    <4>[1051408.015930s]  cache: erofs_compress, object size: 144, buffer size: 144, default order: 0, min order: 0
    <4>[1051408.015930s]  node 0: slabs: 98, objs: 2744, free: 0
      * Cannot allocate the decompression work

    <3>[1051408.015960s]erofs: z_erofs_vle_normalaccess_readpages, readahead error at page 1008 of nid 5391488
      * Note that the previous page failed to be read

    <0>[1051408.015960s]Internal error: Accessing user space memory outside uaccess.h routines: 96000005 [#1] PREEMPT SMP
    ...
    <4>[1051408.015991s]Hardware name: kirin710 (DT)
    ...
    <4>[1051408.016021s]PC is at z_erofs_vle_work_add_page+0xa0/0x17c
    <4>[1051408.016021s]LR is at z_erofs_do_read_page+0x12c/0xcf0
    ...
    <4>[1051408.018096s][<ffffff80c6fb0fd4>] z_erofs_vle_work_add_page+0xa0/0x17c
    <4>[1051408.018096s][<ffffff80c6fb3814>] z_erofs_vle_normalaccess_readpages+0x1a0/0x37c
    <4>[1051408.018096s][<ffffff80c6d670b8>] read_pages+0x70/0x190
    <4>[1051408.018127s][<ffffff80c6d6736c>] __do_page_cache_readahead+0x194/0x1a8
    <4>[1051408.018127s][<ffffff80c6d59318>] filemap_fault+0x398/0x684
    <4>[1051408.018127s][<ffffff80c6d8a9e0>] __do_fault+0x8c/0x138
    <4>[1051408.018127s][<ffffff80c6d8f90c>] handle_pte_fault+0x730/0xb7c
    <4>[1051408.018127s][<ffffff80c6d8fe04>] __handle_mm_fault+0xac/0xf4
    <4>[1051408.018157s][<ffffff80c6d8fec8>] handle_mm_fault+0x7c/0x118
    <4>[1051408.018157s][<ffffff80c8c52998>] do_page_fault+0x354/0x474
    <4>[1051408.018157s][<ffffff80c8c52af8>] do_translation_fault+0x40/0x48
    <4>[1051408.018157s][<ffffff80c6c002f4>] do_mem_abort+0x80/0x100
    <4>[1051408.018310s]---[ end trace 9f4009a3283bd78b ]---

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 14b20a49fc73c4818efa3327451904ef6f9c07ab
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Feb 27 13:33:32 2019 +0800

    staging: erofs: fix mis-acted TAIL merging behavior

    commit a112152f6f3a2a88caa6f414d540bd49e406af60 upstream.

    EROFS has an optimized path called TAIL merging, which is designed
    to merge multiple reads and the corresponding decompressions into
    one if these requests read continuous pages almost at the same time.

    In general, it behaves as follows:
     ________________________________________________________________
      ... |  TAIL  .  HEAD  |  PAGE  |  PAGE  |  TAIL    . HEAD | ...
     _____|_combined page A_|________|________|_combined page B_|____
            1  ]  ->  [  2                          ]  ->  [ 3
    If the above three reads are requested in the order 1-2-3, it will
    generate a large work chain rather than 3 individual work chains
    to reduce scheduling overhead and boost up sequential read.

    However, if Read 2 is processed slightly earlier than Read 1,
    it currently still generates 2 individual work chains (chain 1, 2)
    but does in-place decompression for combined page A. Moreover,
    if chain 2 decompresses ahead of chain 1, the two race and
    corrupt the decompressed page. This patch fixes it.

    Fixes: 3883a79abd02 ("staging: erofs: introduce VLE decompression support")
    Cc: <stable@vger.kernel.org> # 4.19+
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1931a6c5fe28edd9c62d54dd67806c3806e9cdb7
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Tue Dec 11 15:17:50 2018 +0800

    staging: erofs: unzip_vle_lz4.c,utils.c: rectify BUG_ONs

    commit b8e076a6ef253e763bfdb81e5c72bcc828b0fbeb upstream.

    Remove all redundant BUG_ONs, and turn the remaining
    useful ones into DBG_BUGONs.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 0773d1966061cba2de6b226947470baf88feda72
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Tue Dec 11 15:17:49 2018 +0800

    staging: erofs: unzip_{pagevec.h,vle.c}: rectify BUG_ONs

    commit 70b17991d89554cdd16f3e4fb0179bcc03c808d9 upstream.

    Remove all redundant BUG_ONs, and turn the remaining
    useful ones into DBG_BUGONs.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6a00c9d7066562e418e30b1b211c77aed5c40551
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Dec 5 21:23:13 2018 +0800

    staging: erofs: {dir,inode,super}.c: rectify BUG_ONs

    commit 8b987bca2d09649683cbe496419a011df8c08493 upstream.

    Remove all redundant BUG_ONs, and turn the remaining
    useful ones into DBG_BUGONs.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ef609890e1f8f27546f25d058bcaeb3c5a7a982f
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Fri Nov 23 01:16:03 2018 +0800

    staging: erofs: add a full barrier in erofs_workgroup_unfreeze

    commit 948bbdb1818b7ad6e539dad4fbd2dd4650793ea9 upstream.

    Just like other generic locks, insert a full barrier
    to guard against memory reordering.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e88d7d9adb52d0f9ba8028c6b4a13e7e83d743a5
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Fri Nov 23 01:16:02 2018 +0800

    staging: erofs: fix `erofs_workgroup_{try_to_freeze, unfreeze}'

    commit 73f5c66df3e26ab750cefcb9a3e08c71c9f79cad upstream.

    There are two minor issues in the current freeze interface:

       1) Freeze interfaces are not related to CONFIG_DEBUG_SPINLOCK,
          therefore fix the incorrect conditions;

       2) For SMP platforms, it should also disable preemption before
          doing atomic_cmpxchg in case some high-priority task
          preempts between atomic_cmpxchg and preempt_disable, then spins
          on the locked refcount later.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 26b9413853f64a44d858c24bc2b4c834a2e6a1fc
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Fri Nov 23 01:16:01 2018 +0800

    staging: erofs: atomic_cond_read_relaxed on ref-locked workgroup

    commit df134b8d17b90c1e7720e318d36416b57424ff7a upstream.

    It's better to use atomic_cond_read_relaxed, which on ARM64 is
    currently implemented with hardware instructions that monitor changes
    to a variable, instead of open-coded busy waiting.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 28e3fa73e294002f8e7c48b6e9ea92784bf9e21a
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Sat Nov 3 17:23:56 2018 +0800

    staging: erofs: remove the redundant d_rehash() for the root dentry

    commit e9c892465583c8f42d61fafe30970d36580925df upstream.

    There is actually no need at all to d_rehash() for the root dentry
    as Al pointed out, fix it.

    Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
    Cc: Al Viro <viro@ZenIV.linux.org.uk>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e3e7bbe526acfac4307a2a6d7e2aaf5222ea88de
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Sep 19 13:49:07 2018 +0800

    staging: erofs: drop multiref support temporarily

    commit e5e3abbadf0dbd1068f64f8abe70401c5a178180 upstream.

    Multiref support means that a compressed page could have
    more than one reference, which is designed for on-disk data
    deduplication. However, mkfs doesn't support this mode
    at this moment, and the kernel implementation is also broken.

    Let's drop multiref support. If it is fully implemented
    in the future, it can be reverted later.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2dd8bd1abced431fa4be477299fa9ddce4677642
Author: Chen Gong <gongchen4@huawei.com>
Date:   Tue Sep 18 22:27:28 2018 +0800

    staging: erofs: replace BUG_ON with DBG_BUGON in data.c

    commit 9141b60cf6a53c99f8a9309bf8e1c6650a6785c1 upstream.

    This patch replaces BUG_ON with DBG_BUGON in data.c, and adds the
    necessary error handling.

    Signed-off-by: Chen Gong <gongchen4@huawei.com>
    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a14a5cf712938fadd39fb99a8f8a46d72b19cd4d
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Tue Sep 18 22:27:25 2018 +0800

    staging: erofs: complete error handling of z_erofs_do_read_page

    commit 1e05ff36e6921ca61bdbf779f81a602863569ee3 upstream.

    This patch completes the error handling code of z_erofs_do_read_page.
    PG_error will be set when some read error happens, therefore
    z_erofs_onlinepage_endio will unlock this page without setting
    PG_uptodate.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 381d39d1c2d471e4c318320bae60806c5d0b04bd
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Tue Sep 18 22:25:36 2018 +0800

    staging: erofs: fix a bug when applying cache strategy

    commit 0734ffbf574ee813b20899caef2fe0ed502bb783 upstream.

    As described in Kconfig, the last compressed pack should be cached
    for further reading for either `EROFS_FS_ZIP_CACHE_UNIPOLAR' or
    `EROFS_FS_ZIP_CACHE_BIPOLAR' by design.

    However, there is a bug in z_erofs_do_read_page: it
    switches `initial' to `false' at the very beginning, before it decides
    whether to cache the last compressed pack.

    The caching strategy should work properly after applying this patch.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3dc0616d60bcc3888f5dcf4585bcc5e2131a64df
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Fri Nov 23 01:15:59 2018 +0800

    staging: erofs: fix the definition of DBG_BUGON

    [ Upstream commit eef168789866514e5d4316f030131c9fe65b643f ]

    It's better not to BUG_ON the kernel outright; however, developers
    need a way to locate issues as soon as possible.

    DBG_BUGON is introduced, and it only crashes when EROFS_FS_DEBUG
    (an EROFS development feature) is on. It helps developers
    find and solve bugs quickly in eng builds.

    Previously, DBG_BUGON was defined as ((void)0) if EROFS_FS_DEBUG is off,
    but unused variable warnings such as the following could occur:

    drivers/staging/erofs/unzip_vle.c: In function `init_always':
    drivers/staging/erofs/unzip_vle.c:61:33: warning: unused variable `work' [-Wunused-variable]
      struct z_erofs_vle_work *const work =
                                     ^~~~

    Fix it to #define DBG_BUGON(x) ((void)(x)).
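
The effect of the new definition can be reproduced in a small userspace sketch. `DEMO_DEBUG` and `demo_sum()` are hypothetical; the debug branch uses abort() as a stand-in, since BUG_ON is not available outside the kernel.

```c
#include <assert.h>
#include <stdlib.h>

#ifdef DEMO_DEBUG
/* Debug builds: crash loudly when the condition holds. */
#define DBG_BUGON(x)	do { if (x) abort(); } while (0)
#else
/* Non-debug builds: x is still referenced, so its operands count as used. */
#define DBG_BUGON(x)	((void)(x))
#endif

int demo_sum(const int *v, int n)
{
	const int *const first = v;	/* only referenced by DBG_BUGON */
	int i, sum = 0;

	DBG_BUGON(!first);
	for (i = 0; i < n; ++i)
		sum += v[i];
	return sum;
}

/* Usage: behaves like a plain sum when the condition is false. */
void demo_bugon_usage(void)
{
	const int v[] = { 1, 2, 3 };

	assert(demo_sum(v, 3) == 6);
}
```

With ((void)0) the variable `first` would have no remaining use in non-debug builds, which is what triggers -Wunused-variable. One trade-off of ((void)(x)) is that the expression is still evaluated, so it should be free of side effects.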

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 92c97ef11b111b764dc92c5edaf9385f74c72e7d
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Sat Dec 8 00:19:12 2018 +0800

    staging: erofs: fix use-after-free of on-stack `z_erofs_vle_unzip_io'

    [ Upstream commit 848bd9acdcd00c164b42b14aacec242949ecd471 ]

    The root cause is the race as follows:
     Thread #0                         Thread #1

     z_erofs_vle_unzip_kickoff         z_erofs_submit_and_unzip

                                        struct z_erofs_vle_unzip_io io[]
       atomic_add_return()
                                        wait_event()
                                        [end of function]
       wake_up()

    Fix it by taking the waitqueue lock between atomic_add_return and
    wake_up to close the race.

    kernel message:

    Unable to handle kernel paging request at virtual address 97f7052caa1303dc
    ...
    Workqueue: kverityd verity_work
    task: ffffffe32bcb8000 task.stack: ffffffe3298a0000
    PC is at __wake_up_common+0x48/0xa8
    LR is at __wake_up+0x3c/0x58
    ...
    Call trace:
    ...
    [<ffffff94a08ff648>] __wake_up_common+0x48/0xa8
    [<ffffff94a08ff8b8>] __wake_up+0x3c/0x58
    [<ffffff94a0c11b60>] z_erofs_vle_unzip_kickoff+0x40/0x64
    [<ffffff94a0c118e4>] z_erofs_vle_read_endio+0x94/0x134
    [<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8
    [<ffffff94a1076540>] dec_pending+0x134/0x32c
    [<ffffff94a1076f28>] clone_endio+0x90/0xf4
    [<ffffff94a0c83c9c>] bio_endio+0xe4/0xf8
    [<ffffff94a1095024>] verity_work+0x210/0x368
    [<ffffff94a08c4150>] process_one_work+0x188/0x4b4
    [<ffffff94a08c45bc>] worker_thread+0x140/0x458
    [<ffffff94a08cad48>] kthread+0xec/0x108
    [<ffffff94a0883ab4>] ret_from_fork+0x10/0x1c
    Code: d1006273 54000260 f9400804 b9400019 (b85fc081)
    ---[ end trace be9dde154f677cd1 ]---

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 323056dc5fbe4768311194a3a2adf14806f25074
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Tue Sep 18 22:25:33 2018 +0800

    staging: erofs: fix a missing endian conversion

    [ Upstream commit 37ec35a6cc2b99eb7fd6b85b7d7b75dff46bc353 ]

    This patch fixes a missing endian conversion in
    vle_get_logical_extent_head.
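
For readers unfamiliar with this class of bug, a portable userspace stand-in for le16_to_cpu() looks roughly like this. `demo_le16_to_cpu()` is a hypothetical name, and the sketch says nothing about which field was affected in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Decode a little-endian 16-bit on-disk value, whatever the host byte order. */
uint16_t demo_le16_to_cpu(const uint8_t raw[2])
{
	return (uint16_t)(raw[0] | ((uint16_t)raw[1] << 8));
}

/* Usage: on-disk bytes 0x34 0x12 decode to 0x1234 on any host. */
void demo_endian_usage(void)
{
	const uint8_t raw[2] = { 0x34, 0x12 };

	assert(demo_le16_to_cpu(raw) == 0x1234);
}
```

Using the raw field directly happens to work on little-endian hosts, which is why a missing conversion like this tends to show up only on big-endian machines.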

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
    Signed-off-by: Sasha Levin <sashal@kernel.org>

commit 69f2b4eaba237770f5c696942595d064ae3340f8
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Sep 6 17:01:47 2018 +0800

    staging: erofs: rename superblock flags (MS_xyz -> SB_xyz)

    This patch follows commit 1751e8a6cb93 ("Rename superblock
    flags (MS_xyz -> SB_xyz)") and after commit ("vfs: Suppress
    MS_* flag defs within the kernel unless explicitly enabled"),
    there is no MS_RDONLY and MS_NOATIME at all.

    Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: David Howells <dhowells@redhat.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 1621b077d53285bd5127532ce160cec69adbe660
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Tue Aug 28 11:39:48 2018 +0800

    Revert "staging: erofs: disable compiling temporarile"

    This reverts commit 156c3df8d4db4e693c062978186f44079413d74d.

    Since XArray and the new mount APIs weren't merged in the 4.19-rc1
    merge window, the BROKEN mark can be reverted directly without
    any problems.

    Fixes: 156c3df8d4db ("staging: erofs: disable compiling temporarile")
    Cc: Matthew Wilcox <willy@infradead.org>
    Cc: David Howells <dhowells@redhat.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 3bbdccddb4ee53c0b81545226439f231ae698f65
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Aug 6 11:27:53 2018 +0800

    staging: erofs: remove an extra semicolon in z_erofs_vle_unzip_all

    There is an extra semicolon in z_erofs_vle_unzip_all, remove it.

    Reported-by: Julia Lawall <julia.lawall@lip6.fr>
    Signed-off-by: zhong jiang <zhongjiang@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ee25ad8cd5b803ae4cda0116304ff15383cb6881
Author: Kristaps Čivkulis <kristaps.civkulis@gmail.com>
Date:   Sun Aug 5 18:21:01 2018 +0300

    staging: erofs: fix if assignment style issue

    Fix coding style issue "do not use assignment in if condition"
    detected by checkpatch.pl.

    Signed-off-by: Kristaps Čivkulis <kristaps.civkulis@gmail.com>
    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 81d71d6e9a330f4471f3edec412d6124031eac46
Author: Chao Yu <yuchao0@huawei.com>
Date:   Thu Aug 2 17:39:17 2018 +0800

    staging: erofs: disable compiling temporarile

    As Stephen Rothwell reported:

    "After merging the staging tree, today's linux-next build (x86_64
    allmodconfig) failed like this:

    drivers/staging/erofs/super.c: In function 'erofs_read_super':
    drivers/staging/erofs/super.c:343:17: error: 'MS_RDONLY' undeclared (first use in this function); did you mean 'IS_RDONLY'?
      sb->s_flags |= MS_RDONLY | MS_NOATIME;
                     ^~~~~~~~~
                     IS_RDONLY
    drivers/staging/erofs/super.c:343:17: note: each undeclared identifier is reported only once for each function it appears in
    drivers/staging/erofs/super.c:343:29: error: 'MS_NOATIME' undeclared (first use in this function); did you mean 'S_NOATIME'?
      sb->s_flags |= MS_RDONLY | MS_NOATIME;
                                 ^~~~~~~~~~
                                 S_NOATIME
    drivers/staging/erofs/super.c: In function 'erofs_mount':
    drivers/staging/erofs/super.c:501:10: warning: passing argument 5 of 'mount_bdev' makes integer from pointer without a cast [-Wint-conversion]
       &priv, erofs_fill_super);
              ^~~~~~~~~~~~~~~~
    In file included from include/linux/buffer_head.h:12:0,
                     from drivers/staging/erofs/super.c:14:
    include/linux/fs.h:2151:23: note: expected 'size_t {aka long unsigned int}' but argument is of type 'int (*)(struct super_block *, void *, int)'
     extern struct dentry *mount_bdev(struct file_system_type *fs_type,
                           ^~~~~~~~~~
    drivers/staging/erofs/super.c:500:9: error: too few arguments to function 'mount_bdev'
      return mount_bdev(fs_type, flags, dev_name,
             ^~~~~~~~~~
    In file included from include/linux/buffer_head.h:12:0,
                     from drivers/staging/erofs/super.c:14:
    include/linux/fs.h:2151:23: note: declared here
     extern struct dentry *mount_bdev(struct file_system_type *fs_type,
                           ^~~~~~~~~~
    drivers/staging/erofs/super.c: At top level:
    drivers/staging/erofs/super.c:518:20: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
      .mount          = erofs_mount,
                        ^~~~~~~~~~~
    drivers/staging/erofs/super.c:518:20: note: (near initialization for 'erofs_fs_type.mount')
    drivers/staging/erofs/super.c: In function 'erofs_remount':
    drivers/staging/erofs/super.c:630:12: error: 'MS_RDONLY' undeclared (first use in this function); did you mean 'IS_RDONLY'?
      *flags |= MS_RDONLY;
                ^~~~~~~~~
                IS_RDONLY
    drivers/staging/erofs/super.c: At top level:
    drivers/staging/erofs/super.c:640:16: error: initialization from incompatible pointer type [-Werror=incompatible-pointer-types]
      .remount_fs = erofs_remount,
                    ^~~~~~~~~~~~~

    Caused by various commits creating erofs in the staging tree interacting
    with various commits redoing the mount infrastructure in the vfs tree.

    I have disabled CONFIG_EROFS_FS for now:"

    The reason for the compile error is:

    -next collects and merges in-development patches from multiple trees,
    including common vfs changes, but those patches did not cover erofs,
    such as:

    ("vfs: Suppress MS_* flag defs within the kernel unless explicitly enabled")
    https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/commit/?h=for-next&id=109b45090d7d3ce2797bb1ef7f70eead5bfe0ff3

    ("vfs: Require specification of size of mount data for internal mounts")
    https://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs.git/commit/?h=for-next&id=0a191e4505a4f255e6513b49426213da69bf0e80

    The above vfs-related patches have not been merged into the staging
    tree, so if we submit those erofs patches to the staging mailing list
    and they are included in the staging-{test,nexts} tree, they can easily
    cause compile errors.

    We worked out some patches to adapt to those vfs changes, but for now
    we submit them only to the -next tree, temporarily, to avoid the
    compile errors.

    For the potential conflict between erofs and vfs changes in the
    incoming merge window, Stephen suggested that we disable CONFIG_EROFS_FS
    temporarily to get through the merge window, and afterwards restore it
    by re-enabling CONFIG_EROFS_FS and applying those fix patches. Greg
    also confirmed this solution.

    So, let's disable compiling erofs for a while.

    Suggested-by: Stephen Rothwell <sfr@canb.auug.org.au>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 2d4499c8b8b78dd00788a7513373d0900013e850
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Aug 1 14:38:31 2018 +0800

    staging: erofs: remove a redundant marco in xattr

    There is no need to '#if CONFIG_EROFS_FS_XATTR' in xattr.c,
    let's remove it.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8eaefd9be86fa3d85305f05192568ba1507dab75
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Wed Aug 1 17:36:54 2018 +0800

    staging: erofs: add the missing break in z_erofs_map_blocks_iter

    This patch adds a missing break after adding the default case.

    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit b79d82f61f25e532b98f7d3a3d49b250f1728e0d
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Mon Jul 30 09:51:01 2018 +0800

    staging: erofs: use the wrapped PTR_ERR_OR_ZERO instead of open code

    Just a cleanup; the logic doesn't change.
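
    As a rough userspace illustration of what such a cleanup looks like,
    the sketch below models the kernel's error-pointer convention. The
    lower-case helpers (is_err, ptr_err, err_ptr, ptr_err_or_zero) are
    stand-ins written for this example, and open_coded() is only an
    assumed shape of the code that was replaced, not the actual erofs
    source.

```c
#include <errno.h>
#include <stdint.h>

/* Userspace stand-ins modelled on the kernel's error-pointer helpers. */
#define DEMO_MAX_ERRNO 4095

static inline int is_err(const void *ptr)
{
	/* Error codes live in the last DEMO_MAX_ERRNO values of the address space. */
	return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline void *err_ptr(long error)
{
	return (void *)(intptr_t)error;
}

/* Assumed open-coded form: test for an error, then extract it by hand. */
static inline long open_coded(const void *ptr)
{
	if (is_err(ptr))
		return ptr_err(ptr);
	return 0;
}

/* The wrapped form: one helper expresses the same logic. */
static inline long ptr_err_or_zero(const void *ptr)
{
	return is_err(ptr) ? ptr_err(ptr) : 0;
}
```

    Both forms return the same value for every input, which is why the
    change is a pure cleanup.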

    Link: https://lists.01.org/pipermail/kbuild-all/2018-July/050766.html
    Fixes: d72d1ce60174 ("staging: erofs: add namei functions")
    Reported-by: kbuild test robot <lkp@intel.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Reviewed-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ef38dd74d8389a7474b0f947185f347e24419686
Author: Gao Xiang <hsiangkao@aol.com>
Date:   Sun Jul 29 13:37:57 2018 +0800

    staging: erofs: fix conditional uninitialized `pcn' in z_erofs_map_blocks_iter

    This patch adds error handling code to
    z_erofs_map_blocks_iter to fix the compiler warning.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 294e8e93272dcbdabdd6e033e4d14bbfe6d91bb7
Author: Gao Xiang <hsiangkao@aol.com>
Date:   Sun Jul 29 13:34:58 2018 +0800

    staging: erofs: fix compile error without built-in decompression support

    This patch fixes incorrect code snippets that resulted from
    mistakes made when splitting the code into small patches.

    Link: https://lists.01.org/pipermail/kbuild-all/2018-July/050747.html
    Link: https://lists.01.org/pipermail/kbuild-all/2018-July/050750.html
    Reported-by: kbuild test robot <lkp@intel.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit db6fedf04cecf5fa3d78a941fd068d581813dfa0
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Sat Jul 28 15:10:32 2018 +0800

    staging: erofs: fix a compile warning of Z_EROFS_VLE_VMAP_ONSTACK_PAGES

    There is a type mismatch in the definition of
    Z_EROFS_VLE_VMAP_ONSTACK_PAGES, let's fix it.

    Link: https://lists.01.org/pipermail/kbuild-all/2018-July/050707.html
    Reported-by: kbuild test robot <lkp@intel.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 73c620c52e51ff2bf93cf02509cdbd9da3d50220
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:08 2018 +0800

    staging: erofs: add a TODO and update MAINTAINERS for staging

    This patch adds a TODO to list the things to be done, and
    the relevant info to MAINTAINERS so we can take all the blame :)

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit fd66e0b7e7510165f9c2214a0e68d7025f8b8d83
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:07 2018 +0800

    staging: erofs: introduce cached decompression

    This patch adds an option that users can enable
    to cache both incomplete ends of compressed clusters,
    as a complement to in-place decompression, in order
    to boost random reads. It costs more memory than
    in-place decompression alone.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit e84127077ff509f7204888244cf848bf9cddd794
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:06 2018 +0800

    staging: erofs: introduce VLE decompression support

    This patch introduces the basic in-place VLE decompression
    implementation for the erofs file system.

    Compared with fixed-sized input compression, it implements
    what we call 'variable-length extent compression', which
    specifies the same output size for each compression block
    in order to:
      - make full use of IO bandwidth (almost all data read from
        the block device can be used directly for decompression);
      - improve real random read performance (rather than only
        via data caching, which costs more memory);
      - keep relatively lower compression ratios (it saves more
        storage space than fixed-sized input compression configured
        with the same input block size),
    as illustrated below:

            |---  variable-length extent ---|------ VLE ------|---  VLE ---|
             /> clusterofs                  /> clusterofs     /> clusterofs /> clusterofs
       ++---|-------++-----------++---------|-++-----------++-|---------++-|
    ...||   |       ||           ||         | ||           || |         || | ... original data
       ++---|-------++-----------++---------|-++-----------++-|---------++-|
       ++->cluster<-++->cluster<-++->cluster<-++->cluster<-++->cluster<-++
            size         size         size         size         size
             \                             /                 /            /
              \                      /              /            /
               \               /            /            /
                ++-----------++-----------++-----------++
            ... ||           ||           ||           || ... compressed clusters
                ++-----------++-----------++-----------++
                ++->cluster<-++->cluster<-++->cluster<-++
                     size         size         size

    The 'in-place' in the name refers to the decompression mode:
    Instead of allocating independent compressed pages and data
    structures, it reuses the allocated file cache pages as much as
    possible to store its compressed data and the corresponding pagevec
    in a time-sharing approach by default, which is useful in
    low-memory scenarios.

    Finally, other filesystems with (de)compression support use a
    relatively large compression block size, reading and decompressing
    >= 128KB at once, and so appear to have better random read
    performance. In fact they collect small random reads into large
    sequential reads and cache all decompressed data in memory, which is
    not real random read and is unacceptable especially for embedded
    devices with limited memory. In contrast, we select a universal
    small-sized 4KB compressed cluster, which is the smallest page size
    for most architectures. All compressed clusters can be read and
    decompressed independently, which ensures random read performance
    for all use cases.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ab43173ff3316c0120f9b2c3abc325a18773f30f
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:05 2018 +0800

    staging: erofs: introduce workstation for decompression

    This patch introduces another concept used by the unzip
    subsystem called 'workstation'. It can be seen as a sparse
    array that stores pointers to the data structures
    related to the corresponding physical blocks.

    All lookup cases are protected by RCU read lock. Besides,
    reference count and spin_lock are also introduced to
    manage its lifetime and serialize all update operations.

    'workstation' is currently implemented on top of the in-kernel
    radix tree for backward compatibility.
    As the Linux kernel evolves, it could be migrated
    to an XArray implementation in the future.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 84c882ba349e57fa654b0a52d6529bff5c18c0e0
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:04 2018 +0800

    staging: erofs: introduce erofs shrinker

    This patch adds a dedicated shrinker aimed at freeing unneeded
    memory consumed by a number of erofs in-memory data structures.

    Like F2FS and UBIFS, it also adds:
      - sbi->umount_mutex to avoid races between the shrinker and put_super
      - sbi->shrinker_run_no to avoid revisiting recently scanned objects

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dc98494e64df2c56c3d6658f60a86b257e9735a3
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:03 2018 +0800

    staging: erofs: introduce superblock registration

    In order to introduce a shrinker solution for erofs,
    let's first manage all mounted erofs instances.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8ded5dd185d595bc3664cabc5de54b84021d3314
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:02 2018 +0800

    staging: erofs: add a generic z_erofs VLE decompressor

    Currently, this patch implements only an LZ4
    decompressor due to its development priority.

    In the future, erofs will support more compression
    algorithms and formats other than LZ4, thus a generic
    decompressor interface will be needed.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 6406d5e0a4a3a6e88c6898268c36e421d1c5006b
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:01 2018 +0800

    staging: erofs: introduce a customized LZ4 decompression

    We have to reduce the memory cost as much as possible,
    so we don't want to decompress more data beyond
    the output buffer size. However, "LZ4_decompress_safe_partial"
    doesn't guarantee to stop at an arbitrary end position;
    it stops only after its current LZ4 "sequence" is completed.

    Link: https://groups.google.com/forum/#!topic/lz4c/_3kkz5N6n00

    Therefore, I hacked the LZ4 decompression logic by hand. It is
    probably NOT the fastest approach, and I hope for a better
    implementation.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit c21aeb7e5feca41005feac999d4cf446dc65a701
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:22:00 2018 +0800

    staging: erofs: globalize prepare_bio and __submit_bio

    The unzip subsystem also uses these functions,
    let's export them to internal.h.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit a5908581d539ef37d1d390e3ad647216440c0ace
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:59 2018 +0800

    staging: erofs: add erofs_allocpage

    This patch introduces a temporary _on-stack_ page
    pool that reuses freed pages directly as much as
    it can for better performance and releases all pages
    at once. It also slightly reduces the possibility of
    memory allocation failure.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit bbd3e12ab2521a7c982ac0707bf8da7a0d22653b
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:58 2018 +0800

    staging: erofs: add erofs_map_blocks_iter

    This patch introduces an iterable L2P mapping
    operation 'erofs_map_blocks_iter'.
    Compared with 'erofs_map_blocks', it avoids
    a number of redundant 'release and regrab'
    processes if they request the same meta page.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 70622cae335b9140e4358d7084c10cfb3da3301c
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:57 2018 +0800

    staging: erofs: introduce pagevec for unzip subsystem

    For each compressed cluster, a straightforward approach is to
    allocate a fixed or variable-sized (for VLE) array that records
    the corresponding file pages for its decompression if we decide
    to decompress these pages asynchronously (e.g. the read-ahead
    case). However, that could take much more on-heap memory than
    traditional uncompressed filesystems.

    This patch introduces a pagevec solution that reuses some
    allocated file pages in a time-sharing manner to store parts of
    the array itself, in order to minimize the extra memory overhead.
    Thus only a small, constant-sized array, used to bootstrap the
    whole array, is needed.

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ff29dac3b4b402729a0b75b8724793158701b1f5
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:56 2018 +0800

    staging: erofs: <linux/tagptr.h>: introduce tagged pointer

    Currently the kernel has scattered tagged pointer usages hacked
    by hand in plain code, without a unified and portable set of
    functions to highlight the tagged pointer itself and wrap the
    hand-written code so that meaningless magic masks can be cleaned up.

    Therefore, this patch introduces simple generic methods to fold
    tags into a pointer integer. It currently supports the last n bits
    of the pointer for tags, which can be selected by users.

    In addition, it will also be used for the upcoming EROFS filesystem,
    which heavily uses tagged pointer approach for high performance
    and reducing extra memory allocation.

    Link: https://en.wikipedia.org/wiki/Tagged_pointer
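
    To make the idea concrete, here is a minimal userspace sketch of
    folding a tag into the low bits of an aligned pointer. The names
    (demo_tp_fold, demo_tp_ptr, demo_tp_tag, DEMO_TAG_BITS) are
    hypothetical and chosen for this example; they are not the
    interface that <linux/tagptr.h> defines.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the pointee is aligned to at least 1 << DEMO_TAG_BITS bytes. */
#define DEMO_TAG_BITS 2u
#define DEMO_TAG_MASK (((uintptr_t)1 << DEMO_TAG_BITS) - 1)

/* A sufficiently aligned object to point at in the usage example. */
static long demo_slot;

/* Fold a small tag into the unused low bits of an aligned pointer. */
static inline uintptr_t demo_tp_fold(void *ptr, unsigned int tag)
{
	assert(((uintptr_t)ptr & DEMO_TAG_MASK) == 0);	/* low bits must be free */
	assert(tag <= DEMO_TAG_MASK);			/* tag must fit */
	return (uintptr_t)ptr | tag;
}

/* Recover the original pointer by masking the tag bits off. */
static inline void *demo_tp_ptr(uintptr_t folded)
{
	return (void *)(folded & ~DEMO_TAG_MASK);
}

/* Recover the tag from the low bits. */
static inline unsigned int demo_tp_tag(uintptr_t folded)
{
	return (unsigned int)(folded & DEMO_TAG_MASK);
}
```

    Folding and unfolding round-trip: demo_tp_ptr(demo_tp_fold(p, t))
    yields p again, and demo_tp_tag() yields t, without any extra
    allocation for the tag.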

    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 153f5ad87b67c45b453790ce206ced7c6cc62609
Author: Chao Yu <yuchao0@huawei.com>
Date:   Thu Jul 26 20:21:55 2018 +0800

    staging: erofs: support tracepoint

    Add basic tracepoints for ->readpage{,s}, ->lookup,
    ->destroy_inode, fill_inode and map_blocks.

    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 98dd1e3a3f42df26003ae86fd1767b03bef433a6
Author: Chao Yu <yuchao0@huawei.com>
Date:   Thu Jul 26 20:21:54 2018 +0800

    staging: erofs: introduce error injection infrastructure

    This patch introduces error injection infrastructure. With it, we
    can inject errors into any kernel-exported common function that
    erofs uses, forcing erofs into its error paths. As a result, tests
    can cover rarely taken paths more easily to find bugs.
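
    The general pattern can be sketched in userspace as a wrapper that
    fails on purpose at a configurable rate. Everything below
    (demo_fault_info, demo_time_to_inject, demo_alloc) is hypothetical
    and only illustrates the technique; it is not the erofs code, and
    the "every Nth call" policy is an assumption made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

struct demo_fault_info {
	unsigned int inject_rate;	/* fail every Nth call; 0 disables injection */
	unsigned int inject_ops;	/* calls seen since the last injected failure */
};

/* Decide whether the current call should fail artificially. */
static int demo_time_to_inject(struct demo_fault_info *fi)
{
	if (!fi->inject_rate)
		return 0;
	if (++fi->inject_ops < fi->inject_rate)
		return 0;
	fi->inject_ops = 0;
	return 1;
}

/* Wrapper around a common function (malloc here) with an injection hook. */
static void *demo_alloc(struct demo_fault_info *fi, size_t size)
{
	if (demo_time_to_inject(fi))
		return NULL;	/* pretend the allocation failed */
	return malloc(size);
}

/* Drive the wrapper and count how often the caller's error path is hit. */
static unsigned int demo_count_failures(unsigned int rate, unsigned int calls)
{
	struct demo_fault_info fi = { .inject_rate = rate, .inject_ops = 0 };
	unsigned int failures = 0;
	unsigned int i;

	for (i = 0; i < calls; i++) {
		void *p = demo_alloc(&fi, 64);

		if (!p)
			failures++;	/* the error path under test runs here */
		free(p);
	}
	return failures;
}
```

    With a rate of 3, every third call fails, so error paths that
    almost never trigger in normal runs are exercised on every test
    pass.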

    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 220c7448cdc4c38e5177777de23793e653969904
Author: Chao Yu <yuchao0@huawei.com>
Date:   Thu Jul 26 20:21:53 2018 +0800

    staging: erofs: support special inode

    This patch adds support for special inodes, such as block device,
    character device, socket and pipe inodes.

    Reviewed-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 7ed68385c49ac127e5baa07220d37cbf937e89d9
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:52 2018 +0800

    staging: erofs: introduce xattr & acl support

    This implements xattr and acl functionalities.

    Inline and shared xattrs are introduced for flexibility.
    Specifically, if the same xattr occurs many times
    in a large number of inodes or the value of an xattr is so large
    that it isn't suitable to be inlined, a shared xattr
    kept in the xattr meta will be used instead.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit db9bea5cf638b0683376b4118754dad0d444dd7c
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:51 2018 +0800

    staging: erofs: update Kconfig and Makefile

    This commit adds Makefile and Kconfig for erofs, and
    updates Makefile and Kconfig files in the fs directory.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit afad040452afed7d552fb853c8613c6002e17ccb
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:50 2018 +0800

    staging: erofs: add namei functions

    This commit adds functions that translate names to inodes.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 4e7097e1a4a0170e8e51866a1242cc9556dcca5d
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:49 2018 +0800

    staging: erofs: add directory operations

    This adds functions for directories, mainly readdir.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 421bfd9b50b8051aa451be073f6387bc678cccab
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:48 2018 +0800

    staging: erofs: add inode operations

    This adds core functions to get and read an inode.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 944a5ab5bd4fc480e4099c8e5e97a6dca490a239
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:47 2018 +0800

    staging: erofs: add raw address_space operations

    This commit adds functions for meta and raw data, and also
    provides address_space_operations for raw data access.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit 8305bea76c9178ce211e5759061d10effbba958e
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:46 2018 +0800

    staging: erofs: add super block operations

    This commit adds erofs super block operations, including (u)mount,
    remount_fs, show_options, statfs, in addition to some private
    icache management functions.

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit ae2a66470bd70480e4953ff12bc96902e0b59617
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:45 2018 +0800

    staging: erofs: add erofs in-memory stuffs

     - erofs_sb_info:
       contains erofs-specific in-memory information.

     - erofs_vnode:
       contains vfs_inode and other fs-specific information.
       As with the super block, only one in-memory definition exists.

     - erofs_map_blocks
       plays a role in the file L2P mapping

    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

commit dd48cb6b27dc6c5977e59e05c59209ddb68c2f51
Author: Gao Xiang <gaoxiang25@huawei.com>
Date:   Thu Jul 26 20:21:44 2018 +0800

    staging: erofs: add on-disk layout

    This commit adds the on-disk layout header file of erofs.

    Note that the on-disk layout is still WIP, and some fields are
    reserved for future use by design.

    Any comments are welcome.

    Thanks-to: Li Guifu <liguifu2@huawei.com>
    Thanks-to: Sun Qiuyang <sunqiuyang@huawei.com>
    Signed-off-by: Miao Xie <miaoxie@huawei.com>
    Signed-off-by: Chao Yu <yuchao0@huawei.com>
    Signed-off-by: Gao Xiang <gaoxiang25@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-21 13:29:19 +02:00
Yangtao Li
fa8801a2c7 f2fs: use iostat_lat_type directly as a parameter in the iostat_update_and_unbind_ctx()
Convert to use iostat_lat_type as parameter instead of raw number.
BTW, move NUM_PREALLOC_IOSTAT_CTXS to the header file, adjust
iostat_lat[{0,1,2}] to iostat_lat[{READ_IO,WRITE_SYNC_IO,WRITE_ASYNC_IO}]
in tracepoint function, and rename iotype to page_type to match the definition.

Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <error27@gmail.com>
Signed-off-by: Yangtao Li <frank.li@vivo.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-03-20 12:53:01 +02:00
Chao Yu
51333b6966 f2fs: introduce trace_f2fs_replace_atomic_write_block
Commit 3db1de0e582c ("f2fs: change the current atomic write way")
removed the old tracepoints but did not add a new one. This patch
introduces trace_f2fs_replace_atomic_write_block to trace the
atomic_write commit flow.

Fixes: 3db1de0e582c ("f2fs: change the current atomic write way")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-03-20 12:53:00 +02:00
Christoph Hellwig
d1dc8340b7 f2fs: remove the create argument to f2fs_map_blocks
The create argument is always identical to map->m_may_create, so use
that consistently.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-03-20 12:52:59 +02:00
Jaegeuk Kim
66ceae877b f2fs: add block_age-based extent cache
This patch introduces a runtime hot/cold data separation method
for f2fs, in order to improve the accuracy of data temperature
classification and reduce the garbage collection overhead after
long-term data updates.

Enhanced hot/cold data separation can record data block update
frequency as the "age" of the extent per inode, and make use of the age
info to indicate a better temperature type for data block allocation:
 - It records the total data blocks allocated since mount;
 - When a file extent has been updated, it calculates the count of data
blocks allocated since the last update as the age of the extent;
 - Before a data block is allocated, it searches for the age info and
chooses a suitable segment for allocation.
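
The three steps above can be sketched as follows. This is a simplified
userspace model, not the f2fs implementation: all names are hypothetical,
the age is taken directly as the block-count difference, and the two
thresholds are values assumed purely for illustration.

```c
#include <stdint.h>

enum demo_temp { DEMO_HOT, DEMO_WARM, DEMO_COLD };

/* Assumed thresholds, in allocated blocks; chosen only for this example. */
#define DEMO_HOT_MAX_AGE  1000u
#define DEMO_WARM_MAX_AGE 100000u

struct demo_extent {
	uint64_t last_blocks;	/* allocation counter at the previous update */
	uint64_t age;		/* blocks allocated between the last two updates */
};

/* Step 1: total data blocks allocated since mount. */
static uint64_t demo_allocated_blocks;

static void demo_note_alloc(uint64_t nblocks)
{
	demo_allocated_blocks += nblocks;
}

/* Step 2: on extent update, age = blocks allocated since the last update. */
static void demo_extent_updated(struct demo_extent *ex)
{
	ex->age = demo_allocated_blocks - ex->last_blocks;
	ex->last_blocks = demo_allocated_blocks;
}

/* Step 3: before allocation, map the age to a temperature. */
static enum demo_temp demo_pick_temp(const struct demo_extent *ex)
{
	if (ex->age <= DEMO_HOT_MAX_AGE)
		return DEMO_HOT;	/* updated again soon: frequently written */
	if (ex->age <= DEMO_WARM_MAX_AGE)
		return DEMO_WARM;
	return DEMO_COLD;
}

/* Usage: classify an extent updated after the given amount of other writes. */
static enum demo_temp demo_classify(uint64_t blocks_between_updates)
{
	struct demo_extent ex = { .last_blocks = demo_allocated_blocks, .age = 0 };

	demo_note_alloc(blocks_between_updates);
	demo_extent_updated(&ex);
	return demo_pick_temp(&ex);
}
```

An extent rewritten after only a few other allocations gets a small age
and is treated as hot; one rewritten after a large volume of other writes
is treated as cold.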

Test and result:
 - Prepare: create about 30000 files
  * 3% for cold files (with cold file extension like .apk, from 3M to 10M)
  * 50% for warm files (with random file extension like .FcDxq, from 1K
to 4M)
  * 47% for hot files (with hot file extension like .db, from 1K to 256K)
 - create(5%)/random update(90%)/delete(5%) the files
  * total write amount is about 70G
  * fsync will be called for .db files, and buffered write will be used
for other files

The storage of the test device is large enough (128G) that it will not
switch to SSR mode during the test.

Benefit: the dirty segment count increment is reduced by about 14%
 - before: Dirty +21110
 - after:  Dirty +18286

Signed-off-by: qixiaoyu1 <qixiaoyu1@xiaomi.com>
Signed-off-by: xiongping1 <xiongping1@xiaomi.com>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-02-01 00:31:57 +08:00
Jaegeuk Kim
bf6df137dd f2fs: refactor extent_cache to support for read and more
This patch prepares extent_cache to be ready for addition.

Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-02-01 00:31:57 +08:00
Zhang Qilong
be926f7a0d f2fs: add "c_len" into trace_f2fs_update_extent_tree_range for compressed file
trace_f2fs_update_extent_tree_range could not record the compressed
block length in a cluster of a compressed file, so add it.

Signed-off-by: Zhang Qilong <zhangqilong3@huawei.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Signed-off-by: Fiqri Ardyansyah <fiqri0927936@gmail.com>
2022-11-03 11:48:06 +02:00
pwnrazr
638f264111 Merge remote-tracking branch 'android-stable/android-4.14-stable' into 12.1 2022-09-14 19:56:16 +08:00
David Collins
dc6033a776 spmi: trace: fix stack-out-of-bound access in SPMI tracing functions
commit 2af28b241eea816e6f7668d1954f15894b45d7e3 upstream.

trace_spmi_write_begin() and trace_spmi_read_end() both call
memcpy() with a length of "len + 1".  This leads to one extra
byte being read beyond the end of the specified buffer.  Fix
this out-of-bound memory access by using a length of "len"
instead.
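
A minimal userspace sketch of the corrected pattern is shown below. The
record layout and names (demo_trace_rec, demo_record, DEMO_REC_MAX) are
hypothetical and are not the SPMI trace event definitions; the point is
only that the copy length must equal the caller's buffer length.

```c
#include <stddef.h>
#include <string.h>

#define DEMO_REC_MAX 16

struct demo_trace_rec {
	size_t len;
	unsigned char buf[DEMO_REC_MAX];
};

/* Copy exactly len bytes of the caller's buffer into a trace record. */
static struct demo_trace_rec demo_record(const unsigned char *src, size_t len)
{
	struct demo_trace_rec rec = { .len = 0, .buf = { 0 } };

	if (len > DEMO_REC_MAX)
		return rec;	/* refuse oversized payloads in this sketch */

	rec.len = len;
	/*
	 * The buggy pattern passed "len + 1" here, reading one byte past
	 * the end of a len-byte source buffer.
	 */
	memcpy(rec.buf, src, len);
	return rec;
}
```

With a one-byte source such as the 'status' object in the KASAN report,
only that single byte is read, so the neighbouring stack memory is never
touched.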

Here is a KASAN log showing the issue:

BUG: KASAN: stack-out-of-bounds in trace_event_raw_event_spmi_read_end+0x1d0/0x234
Read of size 2 at addr ffffffc0265b7540 by task thermal@2.0-ser/1314
...
Call trace:
 dump_backtrace+0x0/0x3e8
 show_stack+0x2c/0x3c
 dump_stack_lvl+0xdc/0x11c
 print_address_description+0x74/0x384
 kasan_report+0x188/0x268
 kasan_check_range+0x270/0x2b0
 memcpy+0x90/0xe8
 trace_event_raw_event_spmi_read_end+0x1d0/0x234
 spmi_read_cmd+0x294/0x3ac
 spmi_ext_register_readl+0x84/0x9c
 regmap_spmi_ext_read+0x144/0x1b0 [regmap_spmi]
 _regmap_raw_read+0x40c/0x754
 regmap_raw_read+0x3a0/0x514
 regmap_bulk_read+0x418/0x494
 adc5_gen3_poll_wait_hs+0xe8/0x1e0 [qcom_spmi_adc5_gen3]
 ...
 __arm64_sys_read+0x4c/0x60
 invoke_syscall+0x80/0x218
 el0_svc_common+0xec/0x1c8
 ...

addr ffffffc0265b7540 is located in stack of task thermal@2.0-ser/1314 at offset 32 in frame:
 adc5_gen3_poll_wait_hs+0x0/0x1e0 [qcom_spmi_adc5_gen3]

this frame has 1 object:
 [32, 33) 'status'

Memory state around the buggy address:
 ffffffc0265b7400: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
 ffffffc0265b7480: 04 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>ffffffc0265b7500: 00 00 00 00 f1 f1 f1 f1 01 f3 f3 f3 00 00 00 00
                                           ^
 ffffffc0265b7580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 ffffffc0265b7600: f1 f1 f1 f1 01 f2 07 f2 f2 f2 01 f3 00 00 00 00
==================================================================

Fixes: a9fce37481 ("spmi: add command tracepoints for SPMI")
Cc: stable@vger.kernel.org
Reviewed-by: Stephen Boyd <sboyd@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: David Collins <quic_collinsd@quicinc.com>
Link: https://lore.kernel.org/r/20220627235512.2272783-1-quic_collinsd@quicinc.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25 11:11:27 +02:00
pwnrazr
4d1f3b1e36 Merge remote-tracking branch 'android-stable/android-4.14-stable' into dev-base 2022-07-25 07:09:46 +00:00
Steven Rostedt (Google)
98d0dcf81a net: sock: tracing: Fix sock_exceed_buf_limit not to dereference stale pointer
commit 820b8963adaea34a87abbecb906d1f54c0aabfb7 upstream.

The trace event sock_exceed_buf_limit saves the prot->sysctl_mem pointer
and then dereferences it in the TP_printk() portion. This is unsafe as the
TP_printk() portion is executed at the time the buffer is read. That is,
it can be seconds, minutes, days, months, even years later. If the proto
is freed, then this dereference can also lead to a kernel crash.

Instead, save the sysctl_mem array into the ring buffer and have the
TP_printk() reference that instead. This is the proper and safe way to
read pointers in trace events.
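
The difference between the two approaches can be modelled in plain C as
below. The types and functions are hypothetical stand-ins, not the
TRACE_EVENT() definition: "recording" corresponds to the time the event
fires, and reading the record back corresponds to TP_printk() running
later.

```c
#include <string.h>

/* Stand-in for the protocol object that owns the sysctl_mem array. */
struct demo_proto {
	long sysctl_mem[3];
};

/*
 * Unsafe layout, shown for contrast only: the record keeps a pointer,
 * which may dangle by the time the record is printed.
 */
struct demo_rec_by_pointer {
	const long *sysctl_mem;
};

/* Safe layout: the values are copied into the record at event time. */
struct demo_rec_by_value {
	long sysctl_mem[3];
};

static struct demo_rec_by_value demo_record_event(const struct demo_proto *p)
{
	struct demo_rec_by_value rec;

	memcpy(rec.sysctl_mem, p->sysctl_mem, sizeof(rec.sysctl_mem));
	return rec;
}

/* Read the record after the source object has been wiped. */
static long demo_read_after_teardown(void)
{
	struct demo_proto p = { .sysctl_mem = { 10, 20, 30 } };
	struct demo_rec_by_value rec = demo_record_event(&p);

	memset(&p, 0, sizeof(p));	/* stands in for the proto being freed */
	return rec.sysctl_mem[1];	/* still the value seen at event time */
}
```

Because the record owns its own copy, it stays valid no matter how long
after the event it is read.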

Link: https://lore.kernel.org/all/20220706052130.16368-12-kuniyu@amazon.com/

Cc: stable@vger.kernel.org
Fixes: 3847ce32ae ("core: add tracepoints for queueing skb to rcvbuf")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-21 20:42:43 +02:00
pwnrazr
405a898076 Merge remote-tracking branch 'android-stable/android-4.14-stable' into dev-base 2022-07-08 09:16:54 +00:00
pwnrazr
902bff81d9 Merge remote-tracking branch 'android-stable/android-4.14-stable' into dev-base-up
Conflicts:
	drivers/char/Kconfig
	fs/ext4/hash.c
	fs/ext4/namei.c
	lib/crypto/Makefile
2022-07-08 09:13:29 +00:00
Edward Wu
42c75fc81f ata: libata: add qc->flags in ata_qc_complete_template tracepoint
commit 540a92bfe6dab7310b9df2e488ba247d784d0163 upstream.

Add flags value to check the result of ata completion

Fixes: 255c03d15a ("libata: Add tracepoints")
Cc: stable@vger.kernel.org
Signed-off-by: Edward Wu <edwardwu@realtek.com>
Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02 16:18:08 +02:00
pwnrazr
cccc708ed6 Merge remote-tracking branch 'android-stable/android-4.14-stable' into dev-base 2022-06-29 08:06:55 +00:00
Yu Zhao
665f4653df UPSTREAM: mm/swap.c: don't pass "enum lru_list" to trace_mm_lru_insertion()
The parameter is redundant in the sense that it can be extracted
from the "struct page" parameter by page_lru() correctly.

Link: https://lore.kernel.org/linux-mm/20201207220949.830352-5-yuzhao@google.com/
Link: https://lkml.kernel.org/r/20210122220600.906146-5-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 861404536a3af3c39f1b10959a40def3d8efa2dd)
Bug: 228114874
Change-Id: Ia02c0c65dd427a98ffa39e9dc3e2ae701e85fad8
2022-06-27 14:32:37 +00:00
pwnrazr
e628b85dab Merge remote-tracking branch 'jaegeuk-f2fs/linux-4.14.y' 2022-06-27 14:29:45 +00:00
pwnrazr
1426f17bc3 Revert f2fs stuff because they force pushed too much
Revert "Merge branch 'jaegeuk-f2fs' into dev-pwn"

This reverts commit 8e8e81eea6a74740fab2f21a3056b0ee040ab16e, reversing
changes made to aeb3f7d0f73fb5131ac2aa298f84453213d012d4.

Revert "Merge branch 'jaegeuk-f2fs' into dev-pwn"

This reverts commit 3053c18d881d96afad797ec29b493e8ac5f3f198, reversing
changes made to cbd6d2445197d6aa7cff37b74b04c99d521940f4.
2022-06-27 14:29:25 +00:00
Jason A. Donenfeld
707c01fe19 random: remove unused tracepoints
commit 14c174633f349cb41ea90c2c0aaddac157012f74 upstream.

These explicit tracepoints aren't really used and show signs of aging.
It's work to keep these up to date, and before I attempted to keep them
up to date, they weren't up to date, which indicates that they're not
really used. These days there are better ways of introspecting anyway.

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25 11:46:34 +02:00
Jason A. Donenfeld
acbf6f4851 random: use hash function for crng_slow_load()
commit 66e4c2b9541503d721e936cc3898c9f25f4591ff upstream.

Since we have a hash function that's really fast, and the goal of
crng_slow_load() is reportedly to "touch all of the crng's state", we
can just hash the old state together with the new state and call it a
day. This way we don't need to reason about another LFSR or worry about
various attacks there. This code is only ever used at early boot and
then never again.
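
A toy sketch of the idea, assuming a 64-bit state and using FNV-1a purely as a non-cryptographic stand-in for the fast hash the message refers to. The `demo_*` names are hypothetical; the kernel's state layout and hash function differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit state; the real crng state is larger. */
struct demo_crng {
    uint64_t state;
};

/* FNV-1a over a byte range, continuing from hash value h. */
static uint64_t demo_fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;

    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;          /* FNV-1a 64-bit prime */
    }
    return h;
}

/* New state = H(old state || new input), so every state bit is touched. */
static void demo_slow_load(struct demo_crng *crng, const void *in, size_t len)
{
    uint64_t h = 14695981039346656037ULL; /* FNV-1a 64-bit offset basis */

    assert(crng != NULL && (in != NULL || len == 0));
    h = demo_fnv1a(h, &crng->state, sizeof(crng->state));
    h = demo_fnv1a(h, in, len);
    crng->state = h;
}
```

The point of the construction is that no separate mixing primitive has to be analysed: the whole old state and the new input go through one hash.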

Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25 11:46:33 +02:00
Jason A. Donenfeld
62a2b4bd3e random: simplify entropy debiting
commit 9c07f57869e90140080cfc282cc628d123e27704 upstream.

Our pool is 256 bits, and we only ever use all of it or don't use it at
all, which is decided by whether or not it has at least 128 bits in it.
So we can drastically simplify the accounting and cmpxchg loop to do
exactly this.  While we're at it, we move the minimum bit size into a
constant so it can be shared between the two places where it matters.

The reason we want any of this is for the case in which an attacker has
compromised the current state, and then bruteforces small amounts of
entropy added to it. By demanding a particular minimum amount of entropy
be present before reseeding, we make that bruteforcing difficult.

Note that this rationale no longer includes anything about /dev/random
blocking at the right moment, since /dev/random no longer blocks (except
for at ~boot), but rather uses the crng. In a former life, /dev/random
was different and therefore required a more nuanced account(), but this
is no longer the case.

Behaviorally, nothing changes here. This is just a simplification of
the code.
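
A userspace sketch of the all-or-nothing accounting described above, using C11 atomics in place of the kernel's cmpxchg. The `demo_*` names are hypothetical; only the 256-bit pool size and the 128-bit minimum come from the message.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

enum {
    DEMO_POOL_BITS = 256,     /* pool size stated in the message */
    DEMO_POOL_MIN_BITS = 128  /* minimum required before the pool is used */
};

static _Atomic unsigned int demo_entropy_count;

/* Credit entropy, saturating at the pool size. */
static void demo_credit_bits(unsigned int bits)
{
    unsigned int old = atomic_load(&demo_entropy_count);
    unsigned int updated;

    assert(bits <= DEMO_POOL_BITS);
    do {
        updated = old + bits;
        if (updated > DEMO_POOL_BITS)
            updated = DEMO_POOL_BITS;
        /* On failure, old is reloaded with the current value. */
    } while (!atomic_compare_exchange_weak(&demo_entropy_count, &old, updated));
}

/* Debit everything or nothing: succeed only with at least the minimum. */
static bool demo_drain_entropy(void)
{
    unsigned int old = atomic_load(&demo_entropy_count);

    do {
        if (old < DEMO_POOL_MIN_BITS)
            return false;               /* too little: leave the count alone */
    } while (!atomic_compare_exchange_weak(&demo_entropy_count, &old, 0u));
    return true;                        /* count is now zero */
}
```

With this shape, the shared minimum is a single constant and the retry loop has only two outcomes: refuse, or take the whole pool.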

Cc: Theodore Ts'o <tytso@mit.edu>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Eric Biggers <ebiggers@google.com>
Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25 11:46:32 +02:00
Jason A. Donenfeld
55349296ba random: rather than entropy_store abstraction, use global
commit 90ed1e67e896cc8040a523f8428fc02f9b164394 upstream.

Originally, the RNG used several pools, so having things abstracted out
over a generic entropy_store object made sense. These days, there's only
one input pool, and then an uneven mix of usage via the abstraction and
usage via &input_pool. Rather than this uneasy mixture, just get rid of
the abstraction entirely and have things always use the global. This
simplifies the code and makes reading it a bit easier.

Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25 11:46:31 +02:00
Eric Biggers
5ab8e04f6d random: remove dead code left over from blocking pool
commit 118a4417e14348b2e46f5e467da8444ec4757a45 upstream.

Remove some dead code that was left over following commit 90ea1c6436d2
("random: remove the blocking pool").

Cc: linux-crypto@vger.kernel.org
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Jann Horn <jannh@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25 11:46:29 +02:00
Theodore Ts'o
fd5e41d61e random: only read from /dev/random after its pool has received 128 bits
commit eb9d1bf079bb438d1a066d72337092935fc770f6 upstream.

Immediately after boot, we allow reads from /dev/random before its
entropy pool has been fully initialized.  Fix this so that we don't
allow this until the blocking pool has received 128 bits.

We do this by repurposing the initialized flag in the entropy pool
struct, and use the initialized flag in the blocking pool to indicate
whether it is safe to pull from the blocking pool.

To do this, we needed to rework when we decide to push entropy from the
input pool to the blocking pool, since the initialized flag for the
input pool was used for this purpose.  To simplify things, we no
longer use the initialized flag for that purpose, nor do we use the
entropy_total field any more.

Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25 11:46:26 +02:00
Vasily Averin
232db3526c tracing: incorrect isolate_mode_t cast in mm_vmscan_lru_isolate
[ Upstream commit 2b132903de7124dd9a758be0c27562e91a510848 ]

Fixes following sparse warnings:

  CHECK   mm/vmscan.c
mm/vmscan.c: note: in included file (through
include/trace/trace_events.h, include/trace/define_trace.h,
include/trace/events/vmscan.h):
./include/trace/events/vmscan.h:281:1: sparse: warning:
 cast to restricted isolate_mode_t
./include/trace/events/vmscan.h:281:1: sparse: warning:
 restricted isolate_mode_t degrades to integer
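
For context, a small sketch of how a sparse "restricted" type and an explicit force cast interact. It assumes the usual sparse attributes, wrapped here in hypothetical `DEMO_*` macros that expand to nothing under a normal compiler; the `demo_*` type and record are illustrative and not the code changed by this commit.

```c
#include <assert.h>
#include <stddef.h>

#ifdef __CHECKER__                      /* defined when sparse runs */
#define DEMO_BITWISE __attribute__((bitwise))
#define DEMO_FORCE   __attribute__((force))
#else
#define DEMO_BITWISE
#define DEMO_FORCE
#endif

/* A restricted integer type: sparse flags implicit conversions. */
typedef unsigned int DEMO_BITWISE demo_isolate_mode_t;

#define DEMO_ISOLATE_UNMAPPED ((DEMO_FORCE demo_isolate_mode_t)0x2)

/* Trace-style record that stores the mode as a plain integer. */
struct demo_trace_record {
    unsigned int isolate_mode;
};

static void demo_record_mode(struct demo_trace_record *rec,
                             demo_isolate_mode_t mode)
{
    assert(rec != NULL);
    /* The force cast marks the conversion to a plain integer as intended. */
    rec->isolate_mode = (DEMO_FORCE unsigned int)mode;
}
```

Without the force cast, sparse would report the same kind of "degrades to integer" warning quoted above.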

Link: https://lkml.kernel.org/r/e85d7ff2-fd10-53f8-c24e-ba0458439c1b@openvz.org
Signed-off-by: Vasily Averin <vvs@openvz.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-14 16:53:47 +02:00
kondors1995
c19de259b9 Merge pwnrazr/jaegeuk-f2fs 2022-05-19 08:50:04 +00:00
kondors1995
0cca5fb93e Revert "Merge branch 'dev/mglru' into dev/12-2"
This reverts commit e3d1ce4d09, reversing
changes made to d3cf1f4d72.
2022-05-17 08:16:01 +00:00
kondors1995
e3d1ce4d09 Merge branch 'dev/mglru' into dev/12-2 2022-05-11 14:43:25 +00:00
Wilson Sung
7f27cb8ed1 thermal: tracing: Move clock_set_rate outside CONFIG_COMMON_CLK_MSM
Bug: 149660093
Bug: 150825703
Change-Id: I5608b70c29f8a3aa4a3646436d1059ca155b8c29
Signed-off-by: Wilson Sung <wilsonsung@google.com>
Signed-off-by: UtsavBalar1231 <utsavbalar1231@gmail.com>
2022-05-11 14:13:02 +00:00
Yu Zhao
d32868bb0d UPSTREAM: mm/swap.c: don't pass "enum lru_list" to trace_mm_lru_insertion()
The parameter is redundant in the sense that it can be extracted
from the "struct page" parameter by page_lru() correctly.

Link: https://lore.kernel.org/linux-mm/20201207220949.830352-5-yuzhao@google.com/
Link: https://lkml.kernel.org/r/20210122220600.906146-5-yuzhao@google.com
Signed-off-by: Yu Zhao <yuzhao@google.com>
Reviewed-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(cherry picked from commit 861404536a3af3c39f1b10959a40def3d8efa2dd)
Bug: 228114874
Change-Id: Ia02c0c65dd427a98ffa39e9dc3e2ae701e85fad8
2022-05-03 11:10:20 +00:00
Wei Wang
92ed42c092 trace: sched: add capacity change tracing
Add a new tracepoint, sched_capacity_update, that fires when the
capacity value is updated.

Bug: 144177658
Test: Boot and grab trace to check
Change-Id: I30ee55bfcc2fb5a92dd448ad364768ee428f3cc4
Signed-off-by: Wei Wang <wvw@google.com>
2022-04-18 11:35:45 +00:00