Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#295] fix failing with OverlongHeaders #315

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 4 additions & 1 deletion package.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,8 @@ library:
- ftp-client
- crypton-connection
- Glob
- http-client
- http-client >= 0.7.17
- http-client-tls
- http-types
- lens
- modern-uri
Expand Down Expand Up @@ -155,6 +156,7 @@ tests:
- warp
- scotty
- http-types
- http-client
- lens
- modern-uri
- nyan-interpolation
Expand All @@ -164,6 +166,7 @@ tests:
- tasty
- tasty-hunit
- tasty-quickcheck
- text
- time
- universum
- uri-bytestring
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just global comment: Make sure commit names start with uppercase, and use infinitive.

  • [#xxx] Fix OverlongHeaders
  • [#xxx] Add tests

Other than that, looks good, thanks for using Problem/Solution template for commit description 👍

Expand Down
11 changes: 10 additions & 1 deletion src/Xrefcheck/CLI.hs
Original file line number Diff line number Diff line change
Expand Up @@ -103,12 +103,14 @@ addExclusionOptions ExclusionConfig{..} (ExclusionOptions ignore) =

data NetworkingOptions = NetworkingOptions
{ noMaxRetries :: Maybe Int
, noMaxHeaderLength :: Maybe Int
}

addNetworkingOptions :: NetworkingConfig -> NetworkingOptions -> NetworkingConfig
addNetworkingOptions NetworkingConfig{..} (NetworkingOptions maxRetries) =
addNetworkingOptions NetworkingConfig{..} (NetworkingOptions maxRetries maxHeaderLength) =
NetworkingConfig
{ ncMaxRetries = fromMaybe ncMaxRetries maxRetries
, ncMaxHeaderLength = fromMaybe ncMaxHeaderLength maxHeaderLength
, ..
}

Expand Down Expand Up @@ -228,6 +230,13 @@ networkingOptionsParser = do
value Nothing <>
help "How many attempts to retry an external link after getting \
\a \"429 Too Many Requests\" response."

noMaxHeaderLength <- option (Just <$> auto) $
long "header-limit" <>
metavar "INT" <>
value Nothing <>
help "The maximum allowed total size of HTTP headers (in bytes) \
\ that can be returned by the server."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: you insert two spaces between (in bytes) and that. I would strip the one before that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, let's strip extending CLI options. The code is good, the reason is that, in this project, CLI arguments serve to tune how verification is run, and config serves to tune which set of errors should be covered. If I run xrefcheck for the first time on any given project, I shouldn't need to know that I have to specify --header-limit option.

So having header length limit specified in config should suffice.

return NetworkingOptions{..}

dumpConfigOptions :: Parser Command
Expand Down
14 changes: 11 additions & 3 deletions src/Xrefcheck/Command.hs
Original file line number Diff line number Diff line change
Expand Up @@ -7,18 +7,22 @@ module Xrefcheck.Command
( defaultAction
) where

import Universum
import Universum hiding ((.~))

import Control.Lens ((.~))

import Data.Reflection (Given, give)
import Data.Yaml (decodeFileEither, prettyPrintParseException)
import Fmt (build, fmt, fmtLn)
import System.Console.Pretty (supportsPretty)
import System.Directory (doesFileExist)
import Text.Interpolation.Nyan
import Network.HTTP.Client (newManager, managerSetMaxHeaderLength)
import Network.HTTP.Client.TLS (tlsManagerSettings)

import Xrefcheck.CLI (Options (..), addExclusionOptions, addNetworkingOptions, defaultConfigPaths)
import Xrefcheck.Config
(Config, Config' (..), ScannersConfig, ScannersConfig' (..), defConfig, overrideConfig)
(Config, Config' (..), NetworkingConfig' (..), ScannersConfig, ScannersConfig' (..), defConfig, overrideConfig, cNetworkingL, ncHttpManagerL)
import Xrefcheck.Core (Flavor (..))
import Xrefcheck.Progress (allowRewrite)
import Xrefcheck.Scan
Expand Down Expand Up @@ -87,8 +91,12 @@ defaultAction Options{..} = do
whenJust (nonEmpty $ sortBy (compare `on` seFile) scanErrs) reportScanErrs

verifyRes <- allowRewrite showProgressBar $ \rw -> do
let fullConfig = config
let parsedConfig = config
{ cNetworking = addNetworkingOptions (cNetworking config) oNetworkingOptions }

mgr <- newManager $ managerSetMaxHeaderLength (ncMaxHeaderLength (cNetworking parsedConfig)) tlsManagerSettings
let fullConfig = parsedConfig & cNetworkingL . ncHttpManagerL .~ Just mgr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better not to mix config and things created at runtime, and propagate the manager as separate argument. Shouldn't require too much chances as far as I can tell.


verifyRepo rw fullConfig oMode repoInfo

case verifyErrors verifyRes of
Expand Down
19 changes: 19 additions & 0 deletions src/Xrefcheck/Config.hs
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ import Data.Aeson (genericParseJSON)
import Data.Yaml (FromJSON (..), decodeEither', prettyPrintParseException, withText)
import Text.Regex.TDFA.Text ()
import Time (KnownRatName, Second, Time (..), unitsP)
import Network.HTTP.Client (Manager)

import Xrefcheck.Config.Default
import Xrefcheck.Core
Expand Down Expand Up @@ -85,6 +86,19 @@ data NetworkingConfig' f = NetworkingConfig
-- chain.
, ncExternalRefRedirects :: Field f RedirectConfig
-- ^ Rules to override the redirect behavior for external references.
, ncMaxHeaderLength :: Field f Int
-- ^ The maximum allowed total size of HTTP headers (in bytes) that can
-- be returned by the server.
--
-- If the total size of the headers exceeds this value, the request will
-- fail with an error to prevent the processing of excessively large headers.
, ncHttpManager :: Field f (Maybe Manager)
-- ^ A custom HTTP Manager used for all HTTP requests.
--
-- Using the same implicit global manager for provides maximal connection
-- sharing.
--
-- If 'Nothing', a default manager will be used.
} deriving stock (Generic)

-- | A list of custom redirect rules.
Expand Down Expand Up @@ -151,6 +165,8 @@ overrideConfig config
, ncMaxTimeoutRetries = overrideField ncMaxTimeoutRetries
, ncMaxRedirectFollows = overrideField ncMaxRedirectFollows
, ncExternalRefRedirects = overrideField ncExternalRefRedirects
, ncMaxHeaderLength = overrideField ncMaxHeaderLength
, ncHttpManager = overrideField ncHttpManager
}
where
overrideField :: (forall f. NetworkingConfig' f -> Field f a) -> a
Expand Down Expand Up @@ -181,3 +197,6 @@ instance FromJSON (ScannersConfig) where

instance FromJSON (ScannersConfig' Maybe) where
parseJSON = genericParseJSON aesonConfigOption

instance FromJSON Manager where
parseJSON _ = fail "Manager field is not configurable"
7 changes: 7 additions & 0 deletions src/Xrefcheck/Config/Default.hs
Original file line number Diff line number Diff line change
Expand Up @@ -105,6 +105,13 @@ networking:
externalRefRedirects:
#{interpolateIndentF 4 externalRefRedirects}

# The maximum allowed total size of HTTP headers (in bytes) that can
# be returned by the server.
#
# If the total size of the headers exceeds this value, the request will
# fail with an error to prevent the processing of excessively large headers.
maxHeaderLength: 4096

# Parameters of scanners for various file types.
scanners:
# On 'anchor not found' error, how much similar anchors should be displayed as
Expand Down
15 changes: 13 additions & 2 deletions src/Xrefcheck/Verify.hs
Original file line number Diff line number Diff line change
Expand Up @@ -57,7 +57,7 @@ import Network.HTTP.Client
import Network.HTTP.Req
(AllowsBody, CanHaveBody (NoBody), GET (..), HEAD (..), HttpBodyAllowed,
HttpConfig (httpConfigRedirectCount), HttpException (..), HttpMethod, NoReqBody (..),
defaultHttpConfig, ignoreResponse, req, runReq, useURI)
defaultHttpConfig, ignoreResponse, req, runReq, useURI, httpConfigAltManager)
import Network.HTTP.Types.Header (hRetryAfter)
import Network.HTTP.Types.Status (Status, statusCode, statusMessage)
import Text.Interpolation.Nyan
Expand Down Expand Up @@ -136,6 +136,7 @@ data VerifyError
| RedirectMissingLocation RedirectChain
| RedirectChainLimit RedirectChain
| RedirectRuleError RedirectChain (Maybe RedirectRuleOn)
| MaxHeaderLengthError Int
deriving stock (Show, Eq)

data ResponseResult
Expand Down Expand Up @@ -287,6 +288,11 @@ pprVerifyErr' rInfo = \case
Just RROTemporary -> "Temporary redirect"
Just (RROCode code) -> show code <> " redirect"

MaxHeaderLengthError len ->
[int||
The total size of the response headers exceeds the limit of #{len} bytes.
|] <> pprLinkCtx rInfo

attachToRedirectChain :: RedirectChain -> Text -> Builder
attachToRedirectChain chain attached
= build chain <> build attachedText
Expand Down Expand Up @@ -718,7 +724,10 @@ checkExternalResource followed config@Config{..} link
_ -> makeHttpRequest uri GET 0.7

httpConfig :: HttpConfig
httpConfig = defaultHttpConfig { httpConfigRedirectCount = 0 }
httpConfig = defaultHttpConfig
{ httpConfigRedirectCount = 0
, httpConfigAltManager = ncHttpManager
}

makeHttpRequest
:: (HttpMethod method, HttpBodyAllowed (AllowsBody method) 'NoBody)
Expand Down Expand Up @@ -812,6 +821,8 @@ checkExternalResource followed config@Config{..} link
| Just (N.C.HostCannotConnect _ _) <- fromException e
-> throwError ExternalResourceConnectionFailure

OverlongHeaders -> throwError $ MaxHeaderLengthError ncMaxHeaderLength

other -> throwError $ ExternalResourceSomeError $ show other
where
retryAfterInfo :: Response a -> Maybe RetryAfter
Expand Down
74 changes: 74 additions & 0 deletions tests/Test/Xrefcheck/MaxHeaderLengthSpec.hs
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
{- SPDX-FileCopyrightText: 2021 Serokell <https://serokell.io>
-
- SPDX-License-Identifier: MPL-2.0
-}

module Test.Xrefcheck.MaxHeaderLengthSpec where

import Universum hiding ((.~))

import Control.Lens ((.~))
import Data.Set qualified as S
import Network.HTTP.Client (newManager, managerSetMaxHeaderLength, defaultManagerSettings)
import Network.HTTP.Types (ok200)
import Network.Wai qualified as Web
import Test.Tasty (TestTree, testGroup)
import Test.Tasty.HUnit (testCase)
import Web.Scotty qualified as Web
import qualified Data.Text as T
import qualified Data.Text.Lazy as TL

import Test.Xrefcheck.UtilRequests
import Xrefcheck.Config
import Xrefcheck.Progress
import Xrefcheck.Verify

mockHeader :: Int -> IO Web.Application
mockHeader size = Web.scottyApp $ do
Web.matchAny "/header" $ do
Web.setHeader "X-header" (TL.fromStrict $ T.replicate size "x")
Web.status ok200

test_maxHeaderLength :: TestTree
test_maxHeaderLength = testGroup "MaxHeaderLength tests"
[ testCase "Succeeds with small header" $ do
setRef <- newIORef S.empty
mgr <- newManager $ managerSetMaxHeaderLength mhl defaultManagerSettings
checkMultipleLinksWithServer
(5001, mockHeader (mhl `div` 2))
setRef
[ VerifyLinkTestEntry
{ vlteConfigModifier = \c -> c
& cNetworkingL . ncMaxHeaderLengthL .~ mhl
& cNetworkingL . ncHttpManagerL .~ Just mgr
, vlteLink = "http://127.0.0.1:5001/header"
, vlteExpectedProgress = mkProgressWithOneTask True
, vlteExpectationErrors = VerifyResult []
}
]

, testCase "Fails with MaxHeaderLengthError" $ do
setRef <- newIORef S.empty
mgr <- newManager $ managerSetMaxHeaderLength mhl defaultManagerSettings
checkMultipleLinksWithServer
(5002, mockHeader (mhl*2))
setRef
[ VerifyLinkTestEntry
{ vlteConfigModifier = \c -> c
& cNetworkingL . ncMaxHeaderLengthL .~ mhl
& cNetworkingL . ncHttpManagerL .~ Just mgr
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If manager will be not part of config, its creation will likely be somewhere inside checkMultipleLinksWithServer, maybe right in verifyReferenceWithProgress even.

, vlteLink = "http://127.0.0.1:5002/header"
, vlteExpectedProgress = mkProgressWithOneTask False
, vlteExpectationErrors = VerifyResult [MaxHeaderLengthError mhl]
}
]
]
where
mhl = 4096

mkProgressWithOneTask shouldSucceed = report "" $ initProgress 1
where
report =
if shouldSucceed
then reportSuccess
else reportError
7 changes: 7 additions & 0 deletions tests/configs/github-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -94,6 +94,13 @@ networking:
- on: permanent
outcome: invalid

# The maximum allowed total size of HTTP headers (in bytes) that can
# be returned by the server.
#
# If the total size of the headers exceeds this value, the request will
# fail with an error to prevent the processing of excessively large headers.
maxHeaderLength: 4096

# Parameters of scanners for various file types.
scanners:
# On 'anchor not found' error, how much similar anchors should be displayed as
Expand Down