Scala: recursively calling span

The span function in Scala uses a predicate to split a list into two parts: the longest prefix of elements that satisfy the predicate, and the rest of the list.
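As a minimal illustration (the list and predicate here are invented for the example), span stops at the first element that fails the predicate, even if later elements would pass:

```scala
val numbers = List(2, 4, 6, 7, 8, 10)

// span returns (longest prefix satisfying the predicate, everything after it)
val (evens, rest) = numbers.span(_ % 2 == 0)

// evens == List(2, 4, 6)
// rest  == List(7, 8, 10)   (8 and 10 are even, but come after the first failure)
```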

If you call span recursively on the remainder, you can break the list up into a sequence of contiguous pieces.
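A rough sketch of that idea, using a hypothetical helper named groupRuns that is not part of the code in this post: take the run of elements equal to the head, then recurse on whatever span left over.

```scala
// Hypothetical helper for illustration: split a list into runs of equal elements.
def groupRuns[A](xs: List[A]): List[List[A]] = xs match {
  case Nil => Nil
  case head :: _ =>
    // The head always matches itself, so `run` is never empty and the recursion terminates.
    val (run, rest) = xs.span(_ == head)
    run :: groupRuns(rest)
}

val grouped = groupRuns(List("a", "a", "b", "c", "c", "c"))
// grouped == List(List("a", "a"), List("b"), List("c", "c", "c"))
```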

In this case, we’ll take a list of tuples, each pairing a word in a sentence with its part-of-speech tag. In the code below, the tag is the first element of the tuple and the word is the second.

We want to find every contiguous run of words tagged as proper nouns (NNP) and print each run out as a single phrase.
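To make the input concrete, here is a made-up tagged sentence. The words are invented for illustration, and the tags follow Penn Treebank naming (NNP for a singular proper noun), which is an assumption about the tagger in use. Each tuple is (tag, word), the order the code in this post relies on.

```scala
// Hypothetical input: (part-of-speech tag, word) pairs for one sentence.
val results: List[(String, String)] = List(
  ("NNP", "John"),
  ("NNP", "Smith"),
  ("VBD", "flew"),
  ("TO", "to"),
  ("NNP", "New"),
  ("NNP", "York")
)

// The goal is to end up with the phrases "John Smith" and "New York".
```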

type WordPOS = (String, String)
type TaggedSentence = List[WordPOS]
type GroupedPOS = List[TaggedSentence]

def outersplitter(results: TaggedSentence): GroupedPOS = {
  type SideBySideSentence = List[(WordPOS, WordPOS)]

  val cmp: SideBySideSentence =
    results.zip(
      ("", "") :: results
    )

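  // Drop the index added by zipWithIndex, keeping the (current, previous) pair.
  def removeIndex(x: ((WordPOS, WordPOS), Int)) = {
    x._1
  }

  def splitter(results: SideBySideSentence): GroupedPOS = {
    def next(cmp: List[((WordPOS, WordPOS), Int)]) = {
      // Keep taking words while the tag matches the previous word's tag;
      // index 0 is always taken so each recursive call starts a new group.
      val res = cmp.span(
        (pairings) => (pairings._1._1._1 == pairings._1._2._1 || pairings._2 == 0)
      )
      Tuple2(res._1.map(removeIndex), res._2.map(removeIndex))
    }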

    val iter = next(results.zipWithIndex)
    val first = iter._1.map(_._1)

    val second: GroupedPOS =
      iter._2 match {
        case List() => {
          List[TaggedSentence]()
        }
        case _ => {
          val nextList = results.drop(iter._1.size)
          splitter(nextList)
        }
      }

    first :: second
  }

  splitter(cmp)
}

val sentencePairings = outersplitter(results)

val possiblePeople =
  sentencePairings.filter(
    (aGrouping) => {
      // headOption avoids an exception on the empty group produced by an empty sentence
      aGrouping.headOption.exists(_._1 == "NNP")
    }
  ).map(
    _.map(
      _._2
    ).mkString(" ")
  )

The real pain of the whole thing is that you need to track whether you’re at the head of the list when doing the span, because the head has no previous word to compare against.
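One way to sidestep that bookkeeping, offered as a sketch rather than a drop-in replacement: have each recursive call compare every element against the tag of its own head, so no previous-word pairing or index is needed. The names groupByTag and properNounPhrases are hypothetical.

```scala
// Hypothetical compact variant: group contiguous (tag, word) pairs that share a tag.
def groupByTag(sentence: List[(String, String)]): List[List[(String, String)]] =
  sentence match {
    case Nil => Nil
    case (tag, _) :: _ =>
      // The head trivially matches its own tag, so no "am I at the head?" check is needed.
      val (run, rest) = sentence.span(_._1 == tag)
      run :: groupByTag(rest)
  }

// Keep only the proper-noun runs and join their words into phrases.
def properNounPhrases(sentence: List[(String, String)]): List[String] =
  groupByTag(sentence)
    .filter(_.head._1 == "NNP") // runs returned by groupByTag are never empty
    .map(_.map(_._2).mkString(" "))

val phrases = properNounPhrases(List(
  ("NNP", "John"), ("NNP", "Smith"), ("VBD", "flew"),
  ("TO", "to"), ("NNP", "New"), ("NNP", "York")
))
// phrases == List("John Smith", "New York")
```

Like the code above, this groups by exact tag equality, so words tagged NNP and NNPS would land in separate runs.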

If you can find a way to make this more compact, please let me know!
