Sentence Similarity

LintCode Q 856 - Sentence Similarity

Given two sentences words1 and words2 (each represented as an array of strings) and a list of similar word pairs pairs, determine whether the two sentences are similar.

For example, words1 = ["great", "acting", "skills"] and words2 = ["fine", "drama", "talent"] are similar if the similar word pairs are pairs = [["great", "fine"], ["acting", "drama"], ["skills", "talent"]].

Note that the similarity relation is not transitive. For example, if “great” and “fine” are similar, and “fine” and “good” are similar, “great” and “good” are not necessarily similar.

However, similarity is symmetric. For example, “great” and “fine” being similar is the same as “fine” and “great” being similar.

Also, a word is always similar to itself. For example, the sentences words1 = ["great"] and words2 = ["great"] with pairs = [] are similar, even though no similar word pairs are specified.

Finally, sentences can only be similar if they have the same number of words. So a sentence like words1 = ["great"] can never be similar to words2 = ["doubleplus", "good"].
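The four rules above (not transitive, symmetric, reflexive, equal length) can be checked with a small sketch. This is only an illustration, not the solution given later: it stores each listed pair as a single string key in a set, and the class name SimilarityRules, the helper key, and the '#' separator are all assumptions made for this example (it assumes no word contains '#').

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SimilarityRules {
    // Hypothetical helper: encodes an ordered pair as one string key.
    // Assumes words never contain the '#' character.
    static String key(String a, String b) {
        return a + "#" + b;
    }

    static boolean areSimilar(String[] words1, String[] words2, List<List<String>> pairs) {
        // Sentences of different lengths are never similar.
        if (words1.length != words2.length) return false;

        // Store only the pairs that were listed; nothing is inferred by chaining.
        Set<String> direct = new HashSet<>();
        for (List<String> pair : pairs) {
            direct.add(key(pair.get(0), pair.get(1)));
        }

        for (int i = 0; i < words1.length; i++) {
            String a = words1[i];
            String b = words2[i];
            boolean same = a.equals(b);                    // a word is similar to itself
            boolean listed = direct.contains(key(a, b))
                          || direct.contains(key(b, a));   // symmetric: either order counts
            if (!same && !listed) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> pairs = Arrays.asList(
            Arrays.asList("great", "fine"),
            Arrays.asList("fine", "good"));

        // Symmetric: the pair was listed as (great, fine), queried as (fine, great).
        System.out.println(areSimilar(new String[]{"fine"}, new String[]{"great"}, pairs));   // true
        // Not transitive: great~fine and fine~good do not imply great~good.
        System.out.println(areSimilar(new String[]{"great"}, new String[]{"good"}, pairs));   // false
        // Reflexive: identical words need no listed pair.
        System.out.println(areSimilar(new String[]{"great"}, new String[]{"great"},
            Collections.emptyList()));                                                        // true
        // Different lengths are never similar.
        System.out.println(areSimilar(new String[]{"great"},
            new String[]{"doubleplus", "good"}, pairs));                                      // false
    }
}
```

Because only the listed pairs are stored, there is no step that could chain "great" to "good" through "fine", which is exactly what the non-transitivity rule requires.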

Similar Questions: Isomorphic Strings

Solution

Solution: HashMap

Code:

import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public boolean isSentenceSimilarity(String[] words1, String[] words2, List<List<String>> pairs) {
	// sentences of different lengths can never be similar
	if (words1.length != words2.length) return false;
	
	// build the mapping; one word may map to many words, so each key holds a set
	Map<String, Set<String>> map = new HashMap<>();
	for (List<String> pair : pairs) {
		map.computeIfAbsent(pair.get(0), k -> new HashSet<>()).add(pair.get(1));
	}

	for (int i = 0; i < words1.length; i++) {
		if (words1[i].equals(words2[i]))  // a word is similar to itself
			continue;                     // keep checking the remaining positions
		// a pair may be listed in either order, so fail only if both lookups miss
		if ( (map.get(words1[i]) == null || !map.get(words1[i]).contains(words2[i])) && 
		     (map.get(words2[i]) == null || !map.get(words2[i]).contains(words1[i])) )
			return false;
	}

	return true;
}
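One possible variation on the HashMap idea, offered as a sketch rather than as the original solution: record every pair in both directions while building the map, so that the comparison loop needs only a single lookup per position. The class name SymmetricSimilarity and the method name isSimilar are hypothetical, chosen for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class SymmetricSimilarity {
    static boolean isSimilar(String[] words1, String[] words2, List<List<String>> pairs) {
        if (words1.length != words2.length) return false;

        // Insert each pair in both directions so symmetry is handled at build time.
        Map<String, Set<String>> map = new HashMap<>();
        for (List<String> pair : pairs) {
            map.computeIfAbsent(pair.get(0), k -> new HashSet<>()).add(pair.get(1));
            map.computeIfAbsent(pair.get(1), k -> new HashSet<>()).add(pair.get(0));
        }

        for (int i = 0; i < words1.length; i++) {
            if (words1[i].equals(words2[i])) continue;  // a word is similar to itself
            // One lookup is enough because both directions were stored.
            Set<String> similar = map.getOrDefault(words1[i], Collections.emptySet());
            if (!similar.contains(words2[i])) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<List<String>> pairs = Arrays.asList(
            Arrays.asList("great", "fine"),
            Arrays.asList("acting", "drama"),
            Arrays.asList("skills", "talent"));
        String[] words1 = {"great", "acting", "skills"};
        String[] words2 = {"fine", "drama", "talent"};

        System.out.println(isSimilar(words1, words2, pairs));  // true
        System.out.println(isSimilar(words2, words1, pairs));  // true
    }
}
```

With P pairs and sentences of N words, building the map takes O(P) expected time and the comparison takes O(N) expected time. The trade-off is that each pair is stored twice, in exchange for a simpler loop condition.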

Reprint policy

《Sentence Similarity》 by Tong Shi is licensed under a Creative Commons Attribution 4.0 International License