(github) / BQN / doc

Scan

The 1-modifier Scan (`) moves along the first axis of the array 𝕩, building up an array of results by applying 𝔽 repeatedly beginning with 𝕨 or ⊏𝕩. It's related to the fold modifiers, and most closely resembles the APL2-style reduction ¨˝, but it traverses the array in forward rather than reverse index order, and includes all intermediate results of 𝔽 in its output instead of just the final one.

BQN's Scan is ordered differently from Scan in APL. Both include one result for each non-empty prefix of 𝕩. In BQN this is a left-to-right fold, so that each new result requires one application of 𝔽. APL uses right-to-left folds, which matches with reduction, but requires starting over at the end for each new prefix, except in special cases. If needed, this definition can be obtained with a fold on each prefix except the first (which is empty). In the particular case of -⍀, that nested solution isn't needed: negate odd-indexed elements and then apply +`.

Scan also differs from Fold or Insert in that it never depends on 𝔽's identity value, because scanning over an empty array simply returns that array.

Lists

The best-known use of Scan is the prefix sum of a list, in which each element of the result is the sum of that element and all the ones before it. With a shift this can be modified to sum the previous elements only.

↗️

    +` 2‿4‿3‿1
⟨ 2 6 9 10 ⟩

    +`»2‿4‿3‿1  # Exclusive prefix sum
⟨ 0 2 6 9 ⟩

The pattern is generalized to any function 𝔽. With an operand of ×, it can find the first n factorials. With Maximum (⌈), it returns the largest element so far.

↗️

    ×` 1+↕6
⟨ 1 2 6 24 120 720 ⟩

    ⌈` ¯1‿¯2‿0‿4‿2‿1‿5‿¯2
⟨ ¯1 ¯1 0 4 4 4 5 5 ⟩

If provided, 𝕨 gives a starting element for Scan (actually a starting cell, so a single element should be enclosed). Below it ensures that all results of ⌈` are at least 0. In either valence, the shape of the result is always the same as the shape of 𝕩.

↗️

    0 ⌈` ¯1‿¯2‿0‿4‿2‿1‿5‿¯2
⟨ 0 0 0 4 4 4 5 5 ⟩

To see the structure of the computation, it can be helpful to use a symbolic operand 𝔽 that returns a string describing its own application.

↗️

    {"("∾𝕨∾")𝔽"∾𝕩}` "a"‿"b"‿"c"‿"d"
⟨ "a" "(a)𝔽b" "((a)𝔽b)𝔽c" "(((a)𝔽b)𝔽c)𝔽d" ⟩

    (<"w") {"("∾𝕨∾")𝔽"∾𝕩}` "a"‿"b"‿"c"‿"d"
⟨ "(w)𝔽a" "((w)𝔽a)𝔽b" "(((w)𝔽a)𝔽b)𝔽c" "((((w)𝔽a)𝔽b)𝔽c)𝔽d" ⟩

The left argument in each result element is always the previous element, if there is one. Result elements are produced in index order and this element will be reused, rather than computing it again. This can be confirmed by adding a counter to 𝔽, which shows here that scanning a 10-element list makes 9 calls (supplying an initial value would make it 10).

↗️

    c←0
    {c+↩1⋄𝕨+𝕩}` ↕10
⟨ 0 1 3 6 10 15 21 28 36 45 ⟩
    c
9

Some other useful scans apply to boolean lists. The function ∨` (with Or) tests whether this or any previous element is 1, so that the result starts at 0 but permanently switches to 1 as soon as the first 1 is found. Similarly, ∧` turns all instances of 1 after the first 0 to 0.

↗️

    ∨` 0‿0‿1‿0‿0‿1‿0‿1
⟨ 0 0 1 1 1 1 1 1 ⟩

    ∧` 1‿1‿1‿0‿0‿1‿0‿1
⟨ 1 1 1 0 0 0 0 0 ⟩

A more complicated boolean scan, which depends on the left-to-right ordering, is <`. It turns off every other 1 in a group of them—can you see why? One use is to resolve questions regarding backslash escaping: the simple example below removes backslashes except those that are escaped by more backslashes.

↗️

    <` 0‿0‿1‿1‿1‿0‿0‿1‿1‿1‿1
⟨ 0 0 1 0 1 0 0 1 0 1 0 ⟩

    {¬<`'\'=𝕩}⊸/ "ab\\\rs\\\\"
"ab\rs\\"

Reverse scan

We've discussed how the scan moves forward along 𝕩, so that each time 𝔽 takes an old result as 𝕨 and a new value as 𝕩. This means that results correspond to prefixes and go left to right on each one. Since the most important scans have associative, commutative operands, the left-to-right ordering often doesn't make a difference. But sometimes a suffix rather than prefix scan is wanted. For these cases, Scan Under Reverse (`⌾⌽) does the trick.

↗️

    ∨`   0‿0‿1‿0‿0‿1‿0
⟨ 0 0 1 1 1 1 1 ⟩

    ∨`⌾⌽ 0‿0‿1‿0‿0‿1‿0
⟨ 1 1 1 1 1 1 0 ⟩

This function reverses the input, does the scan, and reverses the output. Perhaps not so easy to visualize, but a symbolic operand will again show what it's doing:

↗️

    {"("∾𝕨∾")𝔽"∾𝕩}`⌾⌽ "a"‿"b"‿"c"‿"d"
⟨ "(((d)𝔽c)𝔽b)𝔽a" "((d)𝔽c)𝔽b" "(d)𝔽c" "d" ⟩

The new value is still the right argument to 𝔽, even though with the reversal it's to the left of any values previously seen. If 𝔽 isn't commutative, and this is the wrong order, then 𝔽˜` will switch it around.

↗️

    {"("∾𝕨∾")𝔽"∾𝕩}˜`⌾⌽ "a"‿"b"‿"c"‿"d"
⟨ "(a)𝔽(b)𝔽(c)𝔽d" "(b)𝔽(c)𝔽d" "(c)𝔽d" "d" ⟩

Higher ranks

Scan moves along the leading axis of 𝕩: vertically, for a table. To apply a scan to later axes, use ˘ or ⎉. Since a scan returns an array with the same shape as its argument, this can't cause an error from differing result cell shapes, unlike Fold or Insert.

↗️

    ⊢ a ← ¯2‿0.25‿'a'‿∞ ∾ 3‿4⥊¯1‿0‿1
┌─                
╵ ¯2 0.25 'a' ∞   
  ¯1 0    1   ¯1  
  0  1    ¯1  0   
  1  ¯1   0   1   
                 ┘

    +` a
┌─               
╵ ¯2 0.25 'a' ∞  
  ¯3 0.25 'b' ∞  
  ¯3 1.25 'a' ∞  
  ¯2 0.25 'a' ∞  
                ┘

If 𝕨 is given, it must have the same shape as a major cell of 𝕩 (this is why 𝕨 needs to be enclosed when 𝕩 is a list: in general it's an array). Then the first result cell is found by applying 𝔽 to elements of 𝕨 and ⊏𝕩, and the computation continues as in the one-argument case for remaining cells.

↗️

    3‿2‿1‿0 +` a
┌─              
╵ 1 2.25 'b' ∞  
  0 2.25 'c' ∞  
  0 3.25 'b' ∞  
  1 2.25 'b' ∞  
               ┘

Results are produced in index order. This means that instead of moving along each column in turn, a scan produces the first result cell one element at a time, then the next, and so on. Something like a breadth-first as opposed to depth-first ordering.

Definition

Scan admits a simple recursive definition. 𝕩 is an array of rank one or more and 𝕨, if given, is an atom or array with shape 1↓≢𝕩. The result z←𝕨𝔽`𝕩 is an array with the same shape as 𝕩. If it has length at least one, ⊏z is ⊏𝕩 if 𝕨 isn't given and 𝕨𝔽¨⊏𝕩 if it is. For 0≤i, (i+1)⊏z is (i⊏z)𝔽¨(i+1)⊏𝕩.

The ordering of 𝔽 application is the natural one for this definition: cells are computed in turn, and each instance of 𝔽¨ goes in index order.