This repository has been archived by the owner on Jan 13, 2022. It is now read-only.

Develop #38

Open
wants to merge 526 commits into
base: master
121c1c7
Fix build SnapshotNode imports
Sep 5, 2012
7d29d1c
HDFS: fix test bug in TestTransferBlock
Sep 5, 2012
257b67d
FsShellService: Fix bugs
Sep 5, 2012
c55461c
Rethrow the exception
Sep 6, 2012
c3540c4
Fix re-replication in BlockPlacementPolicyConfigurable.
Sep 6, 2012
bf45dc4
[HDFS Build] fix build_all failure by excluding namespace-notifier bu…
Sep 6, 2012
8b0fb89
Fixing probably-missed sync in VolumeMap.update
Sep 6, 2012
7e3eb8b
[L38-10-bump] Plugin JournalManager to hdfs-core
Sep 6, 2012
59f3765
[L38-11-elopt] Port edit log optimizations.
Sep 6, 2012
b71fe60
[L38-12-bumpavatar] Plugin journal manager into avatarnodes
Sep 6, 2012
6f51c33
Fix for the SocketTimeoutException
Sep 6, 2012
d82b47a
[L38-13-fixes] Fixing some perf issues and todo's for new layout
Sep 6, 2012
278a738
[L38-14-fix] Fix problem with syncing in Ingest
Sep 6, 2012
ff9bdc9
[L38-15-fixes] Enable checkpointing for test case
Sep 6, 2012
3353286
HDFS FsShellService: add getFileStatus() and exists()
Sep 6, 2012
e5a1cb7
Add the GC counters to job status Task #1358198
Sep 6, 2012
0104d76
Set fs.ha.retrywrites for AvatarBalancer.
Sep 6, 2012
32b0869
[L38-17-fix] Fix corner case with non-clean primary shutdown
Sep 6, 2012
45b842c
[L38-18-metrics] Add lastTransactionId to avatar metrics
Sep 6, 2012
a84bffe
HDFS Fsck doesn't show total number of data nodes or racks if user as…
Sep 6, 2012
d38ba8f
[L38-20-rolladmin] Add dfsadmin command rollEditLog
Sep 6, 2012
4ea84f3
[L38-21-merge] Fix merging bug
Sep 6, 2012
4c7011e
Enhance FsShell to support hardlink.
Sep 7, 2012
49e3c72
Add total mapper cpu and total reducer cpu into job stats log (Hadoop…
Sep 7, 2012
da887b2
Improve concurrency of the client cache
Sep 7, 2012
9b37f29
Print the JVM environment before starting
Sep 7, 2012
3f4c7c6
Update Hoopla test for more choices: local mode and disable checksum …
Sep 7, 2012
1b23cc3
HDFS AvatarDataNode to resolve DNS address on some failures
Sep 7, 2012
d0400e4
Retries in JobTrackerReporter
Sep 7, 2012
5cbc947
Shutdown responder cleanly on avatarnode shutdown
Sep 7, 2012
d541964
[L38-22-tests] Fixing test cases after merge
Sep 7, 2012
2080ba5
Dump threads if the task doesn't progress
Sep 7, 2012
77bbb3c
Fix the retry in JobTrackerReporter
Sep 8, 2012
22f2196
Disallow a name directory to be auto-formatted on error
Sep 8, 2012
4ab2522
Fix broken hadoop build
Sep 8, 2012
4305f79
[L38-23-fixes] Stop namenode proxy when initializing Standby
Sep 10, 2012
ae8b5ff
Introducing a timeout for umbilical
Sep 10, 2012
7671d62
Fix timeout handling
Sep 10, 2012
df3bdaa
Fix the protocol getting
Sep 10, 2012
10ad3ff
Use PureJavaCRC32 and add a local write benchmark.
Sep 10, 2012
222ea80
Modification of Hadoop codebase to allow corona task tracker to utili…
Sep 10, 2012
6ad7c83
Quick fix to DatDirFileReader
Sep 10, 2012
366ab45
Add aliases to the list of federation keys.
Sep 11, 2012
0184b37
change start-corona script to call start-proxyjt.sh to first
Sep 11, 2012
111f906
Dump threads when CJT doesn't heartbeat
Sep 11, 2012
5905046
Increaseing zookeeper connect and session timeout for unit tests.
Sep 11, 2012
2f8abfd
Add datanode to outstanding list during registration.
Sep 12, 2012
ab33c13
Fix namenode formatting logic
Sep 12, 2012
877dc6f
Log the CTT exception better
Sep 12, 2012
b1887cd
[L38-24-tests] Clean up checkpointing in tests
Sep 12, 2012
49a7edf
Randomly choose ports instead of sequentially.
Sep 12, 2012
c5f3d91
[L38-25-ckpt] Add logic for readin fs.checkpoint.size
Sep 12, 2012
ac64e55
Fix port conflicts in the tests
Sep 12, 2012
9819deb
Update the GC counters. 1) Only get the heap size for Old gen. 2) Add…
Sep 12, 2012
7827110
FsShellService: make backlog queue configurable and add File Not Foun…
Sep 12, 2012
5b96eb7
FSShellService: file didn't check in for D571836
Sep 12, 2012
aabdc65
HDFS DataNode: remove an unnecessary string format
Sep 12, 2012
98aac1d
Enhance image visitors for hardlinks.
Sep 13, 2012
4ab4a10
API get list of files hardlinked to a given file.
Sep 13, 2012
5fe67fb
Fix trunk build.
Sep 13, 2012
6b31a33
Revert "Randomly choose ports instead of sequentially."
Sep 13, 2012
205e697
Add a HighMemoryUsageKilled counter for a Hadoop job
Sep 13, 2012
419387d
andomly choose ports for MiniDfsCluster
Sep 13, 2012
36969ef
Enforce that at least one image directory is present
Sep 13, 2012
a27e89a
Add condition to check if edit log is open
Sep 14, 2012
8af5518
Add hardlinkId to FileStatusExtended and fix bug in totalFiles.
Sep 14, 2012
fceb42a
Fix bugs in TestAvatarCheckpointing.
Sep 14, 2012
a6190dd
Verify total number of inodes during failover.
Sep 14, 2012
a5e5c8b
Fix the regression caused by D349832.
Sep 14, 2012
89112f1
Add log4j.properties.scribeappender to VENDOR/hadoop-0.20/conf.
Sep 15, 2012
2c369d3
Another fix for TestAvatarCheckpointing.
Sep 15, 2012
2758162
Modification to refreshVolumes to allow for the datanode to also remo…
Sep 17, 2012
e6a925f
Retry to instantiate datanodes and avataranodes for miniavatarcluster
Sep 17, 2012
72c53fe
Remove the default value for volumes tolerated
Sep 17, 2012
16492b7
Add additional check for dir executable when checking directories
Sep 17, 2012
ed93488
Lost task tracker should not fail succeeded map tasks in map-only job
Sep 18, 2012
147df5b
Log the location of task logs.
Sep 19, 2012
91843ac
Extend NNBench to support hardlinks.
Sep 19, 2012
0e72806
Upgrade ZooKeeper to 3.4.3
Nov 8, 2012
f6b3972
HDFS: Backward compabile SendBlock constructor
Sep 6, 2012
2197e0e
Command line option to start avatarnodes with NULL zookeeper.
Sep 19, 2012
ad6a08d
Add setReuseAddress to free ports in MiniAvatarCluster
Sep 19, 2012
b21cab1
Fix TestStandbySafeMode.
Sep 20, 2012
32735ed
HDFS DataNode: avoid two string concatenation when ClientTraceLog is …
Sep 20, 2012
09c3859
Allow Namenode to exclude/deprioritize datanodes which do not heartbe…
Sep 21, 2012
da6d0f4
Do not update replication queues in removeStoredBlock when replicatio…
Sep 22, 2012
ef192ce
Add retries for connecting zk
Sep 22, 2012
0150073
BlockDecompressorStream not to return -1 on IOExeption until EOFExcep…
Sep 24, 2012
81628cd
Make sure remote job tracker exits the proces
Sep 24, 2012
798befb
Fix build error
Sep 24, 2012
d4920b2
Fix problem with upgrade
Sep 24, 2012
303c90f
Gracefully recovery from runtime exceptions in logSync
Sep 24, 2012
29ec5aa
savedLocalDirs should be null, not ""
Sep 24, 2012
a78e818
Send IP address as part of resource grant
Sep 25, 2012
3130262
Improve task launching logic in corona job tracker
Sep 25, 2012
f4a806e
Enhance image/edits config validator for supporting uris
Sep 25, 2012
5345c71
Fix saveVersion for svn checkouts
Sep 25, 2012
8867547
Revert "Enhance image/edits config validator for supporting uris"
Sep 25, 2012
3ba622e
Enhance image/edits config validator for supporting uris
Sep 25, 2012
39b82c4
[HDFS Tool] Parse namenode generation stamp and output a fake block f…
Sep 26, 2012
b4c77d0
D582382 breaks Unit test
Sep 26, 2012
623cc05
FsShellService: add skipTrash option to delete
Sep 26, 2012
169183f
Allow hadoop daemons to run as root
Sep 26, 2012
7060d30
Make sure HttpServer on the namenode shuts down quickly.
Sep 26, 2012
15cf33d
Checking if file has been uploaded to DistributedCache without fetchi…
Sep 26, 2012
64cace4
Avoid waiting for ever on stuck locality thread.
Sep 27, 2012
a2e6c8f
Test of HDFS delete() on file being written
Sep 27, 2012
903d54f
Fix assert in NNStorage
Sep 27, 2012
28d77b9
Fix bug in FSEditLog
Sep 27, 2012
80b2c71
Improve shuffle metrics (more accurate success/failure/serious failure)
Sep 27, 2012
9fee396
FSShell copy uses wrong file system to check isDirectory()
Sep 27, 2012
779db85
Track the number of java processes running on a node
Sep 27, 2012
fa13f58
Fix race condition in processing revoked grants
Sep 27, 2012
908f1bf
Fix bug in FSEditLog
Sep 27, 2012
76590be
Interrupt lease manager before checking the block count
Sep 27, 2012
3177728
Add validation of standby/secondary namenode
Sep 27, 2012
0319e9c
Add "include" option for build-contrib.xml
Sep 27, 2012
150ac90
Job setup, job cleanup, and task cleanup tasks need their own jvm op…
Sep 27, 2012
d77e924
Task error classification for corona.
Sep 28, 2012
0a079bf
Fix dfsclusterhealth
Sep 28, 2012
b627759
Avoid DNS lookup if IP address is already available.
Sep 28, 2012
6645f53
Extend bug fixes done in rH26579 to AvatarDataNode
Sep 28, 2012
0308805
Instantiate only a single lease checker per namenode in Fastcopy.
Sep 29, 2012
157fee8
Enable private actions always in corona jsps
Oct 1, 2012
7f48ac0
fix build error
Oct 1, 2012
bc3c5ee
Disable TestUlimit - it is failing the nightly build. Need to figure …
Oct 1, 2012
335f170
LocalJobRunner generates task completion events
Oct 1, 2012
0e4a8e7
RuntimeMonitor for running JUnit tests
Oct 1, 2012
7cb395b
Investigate more timeouts for failover test
Oct 1, 2012
1bb0188
Fix file descriptor leak in CachingAvatarZookeeperClient.
Oct 1, 2012
6031380
Make sure LocalityStats.record uses hostname instead of IPaddress
Oct 2, 2012
01e2933
Enable restoring failed storage by default
Oct 2, 2012
e876043
Make the port range smaller for avatar tests
Oct 2, 2012
176581d
Log error for after safemode processing
Oct 2, 2012
2d74458
Make streaming job input key/value serialization configurable
Oct 3, 2012
72dfb56
Purge corona system and history directories
Oct 3, 2012
583d5f3
Remove logging for invalidation of blocks when loading edits
Oct 3, 2012
20a5d4a
Pass digestinputstream to loader instead of DIS in FSImage.loadImage()
Oct 4, 2012
8cd6f78
OfflineImageDecompressor tool
Oct 4, 2012
10a0ca8
Introduce logic for max number of failed checkpoints
Oct 4, 2012
4d99c18
Disallow to call save namespace for avatarnode-standby
Oct 4, 2012
3dca2e5
[AUTOCONF] moving autoconf under hadoop-0.20
Oct 5, 2012
00a8761
Illegal pool names should not be allow in configured pools mode
Oct 5, 2012
c5502e7
Do not shutdown standby when failover does not succeed.
Oct 5, 2012
3db977d
Limit class name length for logging messages
Oct 5, 2012
476ea70
Prevent high-memory jobs from starting
Oct 5, 2012
22e9ece
[AUTOCONF] setting default values for config
Oct 5, 2012
e2263bc
Log the memory threshold when rejecting the job.
Oct 5, 2012
9d7180e
Allow killing succeeded tasks from web UI
Oct 5, 2012
78c38b3
Introduce retries for AvatarNode-ZK interaction
Oct 6, 2012
8ab0a0c
Fix testcase TestAvatarStartup
Oct 9, 2012
31e58bf
Deletion with hardlinks doesn't work - fix
Oct 9, 2012
d51e834
Enhance the unit tests for hardlinks
Oct 10, 2012
7ffbe2a
Instrument the number of reducers which process no data(Hadoop changes)
Oct 10, 2012
6ef0fcd
[BKJM-01-ivy] Add BookKeeper dependency to ivy
Oct 11, 2012
f41d433
Default max task memory = 4096m
Oct 11, 2012
b63e6ba
Revert "[BKJM-01-ivy] Add BookKeeper dependency to ivy"
Oct 12, 2012
8abb81d
Remove the possibility of not enabling directory restoration
Oct 12, 2012
db48e92
Fix ingest bugs
Oct 12, 2012
cf141e3
HDFS: fix TestFileAppend4
Oct 12, 2012
9ef51bf
Upgrade ivy to 2.1.0
Oct 12, 2012
56e05a3
Fix NPE in getFileChecksum
Oct 12, 2012
2636e26
[BKJM-01-ivy] Add BookKeeper dependency to ivy (re-submit)
Oct 12, 2012
0109abb
Avoid InetSocketAddress.getHostName()
Oct 12, 2012
402702f
Enable buffered IO for log4j
Oct 13, 2012
47e1e29
Check the log disk free space
Oct 13, 2012
6f92c33
Get the included hosts without copying
Oct 13, 2012
50c368f
Fix the cleanup of jtlogs
Oct 13, 2012
8a03075
Remove getFilesTotal() from FSNamesystem
Oct 13, 2012
c76329b
HDFS Inline Checksum
Oct 13, 2012
eb19365
Enhance AvatarShell to support an interactive CLI for failover.
Oct 13, 2012
50ff7c1
DFSClient.getFileChecksum() to try other data nodes for connection error
Oct 14, 2012
189442e
Fix TestOverReplicatedBlocks
Oct 15, 2012
a87cff0
Fixes for java7 - batch 2
Oct 16, 2012
32d2db0
Add syncing when writing to ZK.
Oct 16, 2012
d3e535e
Fixes for java7 - batch 1
Oct 16, 2012
00e3d37
Introduce timeout for http connection for image download
Oct 16, 2012
daadbce
corona: show the pool name for active jobs
Oct 16, 2012
6dae40f
Add purging of ckpt images
Oct 16, 2012
2ce295a
HDFS Inline Checksum to add version number in file name and eliminate…
Oct 16, 2012
184ecb7
Clean up checkpoint properly on primary/standby
Oct 16, 2012
d97e135
Fix the "ignoreKey" flag in Hadoop Streaming
Oct 16, 2012
0cc1beb
Fix NPE in transfer fsimage
Oct 17, 2012
02133e5
Fix TestAvatarContinuousFailover.
Oct 17, 2012
5b4526d
HDFS: Fix bug of truncateBlock() when no partial chunk
Oct 17, 2012
1b84935
Fix a regression caused by removing header from inline checksum
Oct 17, 2012
05a70ac
Fix several test cases for inline checksum
Oct 17, 2012
80523c2
Updated corona cluster manager freeslot counts to pages that provide …
Oct 17, 2012
218a4a7
Pulled revision 29960 that had improperly formatted log message
Oct 17, 2012
936507c
Provide more details of unallocated resources in Corona
Oct 17, 2012
5027192
HDFS: Fix TestDFSShell after removing header for inlinechecksum files
Oct 17, 2012
5125a4d
Fix problem with NPE in transfer fs image
Oct 17, 2012
02bded0
Remove ivy-2.0.0 jar, add ivy-2.1.0 jar
Nov 8, 2012
3e16997
Parse "True" and "1" as "true", "False" and "0" as "false"
Oct 18, 2012
7a51799
Simplify standby backup logic
Oct 18, 2012
a4e48be
Disable standby backup limits for tests
Oct 18, 2012
9c6f3d2
TestFileStatusExtended failure
Oct 19, 2012
1cee40e
Fix Date Time Format
Oct 19, 2012
238c612
add seqNo between Corona cluster manager and CoronaJobTracker
Oct 19, 2012
01e6000
Another source of mismatched generation stamp replicas
Oct 19, 2012
84c77a5
Fix OfferService ZK interaction
Oct 20, 2012
0998e98
Fix bug where datanode couldn't register and continuously sent block …
Oct 20, 2012
d287a5f
Inline checksums - fix problem with block report
Oct 21, 2012
0f88eb0
Make the injections to be the last statements of doCheckpoint
Oct 22, 2012
219d402
Fix TestAvatarCleanShutdown conflict with host file name
Oct 22, 2012
f26a45d
Enable private actions always coronajobdetails.jsp
Oct 22, 2012
85ca131
Fix the free disk space api.
Oct 22, 2012
7c9e522
Fixes for java7 - batch 3
Oct 22, 2012
3716d28
Fix broken link for failed tasks generated by corona job tracker
Oct 22, 2012
38a2095
Fix TestAvatarCheckpointing
Oct 22, 2012
a2eb33f
[DistCp Tos] support the TOS setting in DistCp
Oct 23, 2012
d066b60
[HDFS Namenode BugFix] process over replictaed block asynly
Oct 24, 2012
9d11b0c
Fix reporting of received blocks when they are rejected
Oct 24, 2012
095499b
Fix more failures in TestAvatarCheckpointing
Oct 24, 2012
f42e452
Online upgrade of task trackers
Oct 24, 2012
2a79ba7
Max Map-Tasks for Map-Reduce Jobs
Oct 24, 2012
c9b1771
Ant rule to generate a single jar with all classes
Oct 24, 2012
5ad5ffb
[Hdfs Fix] Fix the TestNativeIO
Oct 24, 2012
467c2df
Fix http://ci-builds.fb.com/job/build_hadoop_warehouse_percommit_trun…
Oct 24, 2012
5b8ef2f
Backup Corona releases so that Corona job tracker will not be interru…
Oct 24, 2012
ff5321c
No exception if releaseDir is not setup in conf for CoronaReleaseManager
Oct 25, 2012
608acf4
Make storage directory failures consistent
Oct 25, 2012
33684aa
Fix problem with handling failed journals
Oct 26, 2012
c67f9e4
Create unit test for CoronaReleaseManager from https://phabricator.f…
Oct 26, 2012
5763288
Trace more details (and cleanup)
Oct 26, 2012
2a3769b
Reducer TaskCompletion events are not at all populated for the hive q…
Oct 26, 2012
dc71257
Speed up offline image viewer
Oct 26, 2012
6c7c8b6
Fail namenode when journal set is empty or editlog stream is null
Oct 26, 2012
e504354
setTimes doesn't change the modificationTime of a working copy's tag …
Oct 26, 2012
1ae56c1
Fix TestAvatarStartup
Oct 27, 2012
0cc09bf
Add check to avoid NPE in FSNamesystem close.
Oct 28, 2012
a99a486
Revert "Fix TestAvatarStartup"
Oct 28, 2012
ceb7e99
DFSAdmin#getBlockInfo prints block generation stamp
Oct 28, 2012
eb3d90c
Disallow non-initial block reports to add mismatched GS replicas into…
Oct 29, 2012
f81d7d0
Fix TestAvatarStartup - attempt2
Oct 29, 2012
ea61b9b
More optimizations for offline image viewer
Oct 29, 2012
f0496a5
Command line interface for testing expire files logic
Oct 29, 2012
f2d2ca2
Memory leak on standby's
Oct 30, 2012
9d29e9f
Fix AvatarNode cleaner interval
Oct 30, 2012
0aec31d
Remove excessive logging
Oct 31, 2012
4206b06
Enhance metasave for saving excess blocks
Oct 31, 2012
6d27e04
Reduce pendingReplicationCheck granularity
Oct 31, 2012
6b4087b
Fix TestPersistTxId after Injection handler refactorization.
Oct 31, 2012
5fc7c5d
Fix bin-package for snappy
Nov 1, 2012
a6d251a
Enabled job file cleanup always
Nov 1, 2012
9894499
Fix task error counters
Nov 2, 2012
9598389
Added TestCompletionEvents
Nov 3, 2012
9c3ee00
Add getRedirect to ClusterManager
Nov 5, 2012
26c6d6a
Commiting Thrift Generated files
Nov 6, 2012
2e30072
HDFS: Fix DFSClient's memory leak on OutputStreams hold by FileChecker
Nov 6, 2012
Do not update replication queues in removeStoredBlock when replication queues are not initialized.

Summary:
The Standby hit a bug: although it never initializes replication
queues, it was incorrectly adding blocks to its under-replication
queue. This caused a continuously increasing missing block count on
the Standby.

Test Plan: All unit tests.

Reviewers: hkuang, tomasz, weiyan, sdong

Reviewed By: tomasz

Task ID: 1742688
pritam authored and Alex Feinberg committed Nov 8, 2012
commit da6d0f4c314da4fe2330db7aa34aa680ab5dbaa2
@@ -6048,11 +6048,14 @@ private void removeStoredBlock(Block block, DatanodeDescriptor node) {
     decrementSafeBlockCount(block);
     // handle under replication
     // Use storedBlock here because block maybe a deleted block with size DELETED
-    NumberReplicas num = countNodes(storedBlock);
-    int numCurrentReplicas = num.liveReplicas() +
+    if (isPopulatingReplQueues()) {
+      NumberReplicas num = countNodes(storedBlock);
+      int numCurrentReplicas = num.liveReplicas()
+          +
       pendingReplications.getNumReplicas(storedBlock);
-    updateNeededReplicationQueue(storedBlock, -1, numCurrentReplicas,
-        num.decommissionedReplicas, node, fileINode.getReplication());
+      updateNeededReplicationQueue(storedBlock, -1, numCurrentReplicas,
+          num.decommissionedReplicas, node, fileINode.getReplication());
+    }
   }
 
   //
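
To illustrate the effect of the guard outside the namenode, here is a minimal, self-contained sketch. It is a toy model, not the FSNamesystem code: the class `ReplQueueGuardSketch`, its set-based queue, and the simplified `removeStoredBlock(long, int, int)` signature are assumptions made for illustration. Only the `isPopulatingReplQueues()` check mirrors the diff above.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the guard added in this commit; NOT the real FSNamesystem. */
public class ReplQueueGuardSketch {
    // Hypothetical stand-in for the namenode's under-replication queue.
    private final Set<Long> neededReplications = new HashSet<Long>();
    // True on a node that has initialized its replication queues,
    // false on a Standby that never initializes them.
    private final boolean populatingReplQueues;

    public ReplQueueGuardSketch(boolean populatingReplQueues) {
        this.populatingReplQueues = populatingReplQueues;
    }

    public boolean isPopulatingReplQueues() {
        return populatingReplQueues;
    }

    /** Simplified signature: skip all queue bookkeeping unless the queues are live. */
    public void removeStoredBlock(long blockId, int liveReplicas, int expectedReplicas) {
        if (isPopulatingReplQueues()) {
            if (liveReplicas < expectedReplicas) {
                neededReplications.add(blockId);
            }
        }
    }

    public int underReplicatedCount() {
        return neededReplications.size();
    }

    public static void main(String[] args) {
        ReplQueueGuardSketch standby = new ReplQueueGuardSketch(false);
        ReplQueueGuardSketch primary = new ReplQueueGuardSketch(true);
        for (long id = 0; id < 3; id++) {
            // Each call models losing the last replica of a block.
            standby.removeStoredBlock(id, 0, 3);
            primary.removeStoredBlock(id, 0, 3);
        }
        System.out.println("standby=" + standby.underReplicatedCount()); // standby=0
        System.out.println("primary=" + primary.underReplicatedCount()); // primary=3
    }
}
```

In this model the Standby's queue stays empty however many replicas are removed, which is the behavior the commit summary asks for. Without the outer check, the Standby's count would grow on every call.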