Leet269 - Alien Dictionary

Description

There is a new alien language which uses the latin alphabet. However, the order among letters are unknown to you. You receive a list of non-empty words from the dictionary, where words are sorted lexicographically by the rules of this new language. Derive the order of letters in this language.

> Solve problem here

Example 1:

Given the following words in dictionary,

1
2
3
4
5
6
7
[
"wrt",
"wrf",
"er",
"ett",
"rftt"
]

The correct order is: "wertf".

Example 2:

Given the following words in dictionary,

1
2
3
4
[
"z",
"x"
]

The correct order is: "zx".

Example 3:

Given the following words in dictionary,

1
2
3
4
5
[
"z",
"x",
"z"
]

The order is invalid, so return "".

Note:

  • You may assume all letters are in lowercase.
  • You may assume that if a is a prefix of b, then a must appear before b in the given dictionary.
  • If the order is invalid, return an empty string.
  • There may be multiple valid order of letters, return any one of them is fine.

Test cases:

Input validations:

Regular cases:

COrner cases:

  • zero/one word in dictionary
  • assumption that not ALL letter will be ordered
  • invalid order in dictionary, cycle detected
  • Do I need to get ALL letter? for the adj matrix? seems not necessary. Without getting all letters, how to know if there is a cycle.

Thoughts

Different from Course Schedule, Course Schedule II (TODO)

  • total number course/nodes was unknown
  • node/letter relation need to be derived from dictionary

BFS Topological Sort

The first step of this problem is to store node relations from words in dictionary. graph need to be initialized, node relationship is built, and inEdge count is also performed.

Example:

1
2
3
4
5
6
7
8
9
10
dict = ['wrt', 'wcg']
from dict, we only know

r -> c,

Letter count is 5.
`r` is letter with zero in-edge count.
letter `w`, `c`, `g` are stand-alone letters.

one of the answers could be `wtgrc`

The second step goes over nodes with zero in-edge and process the toplogical order of the graph for one of the valid alien letter order. BFS is used during the topolical sorting.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
public String alienOrder(String[] words) {
// build graph
Map<Character, Set<Character>> graph = new HashMap<>();
int[] inEdges = new int[26];
buildGraph(words, graph, inEdges);

// topological sort
String res = topoSort(graph, inEdges);

// check for cycle
if (res.length() != graph.size()) return "";

// return solution
return res;
}

// build graph & count inEdge
public void buildGraph(String[] words, Map<Character, Set<Character>> graph, int[] inEdges) {
// init adj list
for (String word : words) {
for (char c : word.toCharArray()) {
graph.putIfAbsent(c, new HashSet<>()); // putIfAbsent()
}
}

// build adj list
String pre = words[0];
for (int i = 1; i < words.length; i++) {
String cur = words[i];
int len = Math.min(pre.length(), cur.length());

for (int j = 0; j < len; j++) {
char c1 = pre.charAt(j);
char c2 = cur.charAt(j);
if (c1 == c2) continue;

graph.get(c1).add(c2);
inEdges[c2 - 'a'] += 1;
break; // import
}
pre = cur; // import
}
}

public String topoSort(Map<Character, Set<Character>> graph, int[] inEdges) {
Deque<Character> queue = new ArrayDeque<>();

// enque zero indegree node
for (char c : graph.keySet()) { // keySet()
if (inEdges[c - 'a'] == 0) {
queue.offer(c);
}
}

// topological sort
StringBuilder sb = new StringBuilder();
while (!queue.isEmpty()) {
char parent = queue.poll();
sb.append(parent);
for (char child : graph.get(parent)) {
inEdges[child - 'a'] -= 1;
if (inEdges[child - 'a'] == 0) {
queue.offer(child);
}
}
}
return sb.toString();
}

DFS Topological Sort

https://github.com/awangdev/LintCode/blob/master/Java/Alien%20Dictionary.java

Followup

  • There may be multiple valid order of letters, return the smallest in lexicographical order

The follow up want one of the solutions with smallest lexicogrpahical order. We just need to replace Queue with PriorityQueue to make it to work.

1
2
// Deque<Character> queue = new ArrayDeque<>();
PriorityQueue<Character> queue = new PriorityQueue<>();

Summary

  • find cycle in graph (DFS light weighted, BFS too much)
  • directed graph
  • count total number of node/construct adj matrix or list
  • get one of the topological sort

Logs

Reference