UPSTREAM: perf/arm-cmn: Optimise DTM counter reads

When multiple nodes of the same type are connected to the same XP
(particularly in CAL configurations), it seems that they are likely
to be consecutive in logical ID. Therefore, we're likely to gain a
small benefit from an easy tweak to optimise out consecutive reads
of the same set of DTM counters for an aggregated event.

Bug: 247832918
Change-Id: Iab3e36dbbbd22426930ff058b4245dab0e266955
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Link: https://lore.kernel.org/r/7777d77c2df17693cd3dabb6e268906e15238d82.1638530442.git.robin.murphy@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 847eef94e6327dd7690bfac0bd3a81a7ba6aa1ee)
Signed-off-by: Robin Peng <robinpeng@google.com>
This commit is contained in:
Robin Murphy
2021-12-03 11:44:56 +00:00
committed by Treehugger Robot
parent 86d818633c
commit fb9091356e

View File

@@ -690,18 +690,19 @@ static void arm_cmn_pmu_disable(struct pmu *pmu)
static u64 arm_cmn_read_dtm(struct arm_cmn *cmn, struct arm_cmn_hw_event *hw,
bool snapshot)
{
struct arm_cmn_dtm *dtm = NULL;
struct arm_cmn_node *dn;
unsigned int i, offset;
u64 count = 0;
unsigned int i, offset, dtm_idx;
u64 reg, count = 0;
offset = snapshot ? CMN_DTM_PMEVCNTSR : CMN_DTM_PMEVCNT;
for_each_hw_dn(hw, dn, i) {
struct arm_cmn_dtm *dtm = &cmn->dtms[dn->dtm];
int dtm_idx = arm_cmn_get_index(hw->dtm_idx, i);
u64 reg = readq_relaxed(dtm->base + offset);
u16 dtm_count = reg >> (dtm_idx * 16);
count += dtm_count;
if (dtm != &cmn->dtms[dn->dtm]) {
dtm = &cmn->dtms[dn->dtm];
reg = readq_relaxed(dtm->base + offset);
}
dtm_idx = arm_cmn_get_index(hw->dtm_idx, i);
count += (u16)(reg >> (dtm_idx * 16));
}
return count;
}